Container Security: Image Scanning and Vulnerability Management
Implement comprehensive container security: from scanning images for vulnerabilities to runtime security monitoring and secrets protection.
Containers share the kernel with the host and with other containers. That is not a theoretical risk, it is a practical one. A container escape or a compromised image can give an attacker access to your entire cluster. This is why container security is not optional.
When to Use
Image Signing (Cosign) vs. Not
Use image signing when you deploy to production environments where you need to verify that only images you intentionally built reach your cluster. Signing matters most in multi-team environments where anyone could push to your registry, or when you pull base images from third parties.
Do not use image signing if you are the only person building and deploying images in a small team with a private registry. The operational overhead of key management exceeds the security benefit until you have multiple contributors.
Falco vs. Other Runtime Security Tools
Use Falco when you want open-source runtime security with an active community and Kubernetes-native integration. Falco has the largest rule set community and works well with standard Kubernetes logging.
Use alternatives like Sysdig Falco Enterprise or Aqua Security when you need commercial support, specific compliance framework integrations, or tighter integration with your SIEM.
AppArmor and Seccomp: When They Are Overkill
AppArmor and Seccomp profiles are worth the effort for regulated environments (financial services, healthcare) or for workloads handling sensitive data. The performance overhead is minimal and the blast radius reduction is significant.
Do not invest in custom Seccomp profiles for stateless microservices with no external network access. The operational cost of maintaining profiles exceeds the risk reduction for low-sensitivity workloads. Use the default Docker Seccomp profile instead.
Image Scanning with Trivy and Grype
Scan every image before it touches your cluster. Not sometimes. Not in staging only. Every image, every push, in your CI pipeline.
Trivy is the default choice for most teams. It is fast, has a large vulnerability database, and integrates with most CI systems.
# Install Trivy
brew install trivy
# Scan an image
trivy image myregistry/myapp:latest
# Scan in CI with exit code on high vulnerabilities
trivy image --exit-code 1 --severity HIGH,CRITICAL myregistry/myapp:latest
Grype is another option, particularly if you want to scan SBOMs (Software Bills of Materials) or need a different database backend.
# Install Grype
brew install grype
# Scan with SBOM input
grype sbom:./sbom.json
# JSON output for automation
grype image myregistry/myapp:latest -o json > results.json
Both tools pull from multiple vulnerability databases, including the Ubuntu, Debian, and Alpine security feeds, plus language-ecosystem advisories (such as the GitHub Advisory Database for Python and npm packages).
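Beyond the simple exit-code gate, you can parse the JSON report yourself, for example to honor a documented exception list of risk-accepted CVEs while still failing on everything else. A minimal sketch in Python; the `WAIVED` set and `blocking_cves` helper are illustrative, while the `Results[].Vulnerabilities[]` shape matches Trivy's JSON report format:

```python
import json

# CVEs with a documented risk-acceptance sign-off (illustrative waiver list)
WAIVED = {"CVE-2023-0001"}

def blocking_cves(scan: dict, severities=("HIGH", "CRITICAL")) -> list:
    """Return the vulnerability IDs that should fail the build.

    Expects Trivy's JSON report shape: Results[].Vulnerabilities[],
    each entry carrying VulnerabilityID and Severity fields.
    """
    found = set()
    for result in scan.get("Results", []):
        # Vulnerabilities can be absent or null for clean targets
        for vuln in result.get("Vulnerabilities") or []:
            if vuln["Severity"] in severities and vuln["VulnerabilityID"] not in WAIVED:
                found.add(vuln["VulnerabilityID"])
    return sorted(found)

# Usage: blocking_cves(json.load(open("scan-results.json")))
# Fail the CI job (exit non-zero) whenever the returned list is non-empty.
```

Keeping the waiver list in version control gives you the audit trail for the exception process: every accepted CVE is a reviewed commit.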
SBOM Generation and Vulnerability Tracking
An SBOM is a formal record of the packages and dependencies in your software. Think of it as an ingredient list for your container image.
Generate SBOMs at build time:
# Generate SBOM with Syft
syft myregistry/myapp:latest -o spdx-json > sbom.spdx.json
# Or build with Cloud Native Buildpacks, which attach an SBOM to the image
pack build --builder heroku/buildpacks:20 myregistry/myapp:latest
SBOMs serve two purposes. First, when a new vulnerability drops (like Log4Shell), you can query your SBOM database to find every image affected in minutes, not hours. Second, SBOMs give you audit trails for compliance.
Store SBOMs alongside your images in a registry that supports it, or in a separate artifact storage.
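When the next Log4Shell lands, stored SBOMs turn "which images are affected?" into a query instead of a mass rescan. A sketch of that query in Python, assuming one SPDX JSON document per image in a directory; the `packages[].name`/`versionInfo` fields are part of the SPDX JSON format, while the file layout and function name are illustrative:

```python
import json
from pathlib import Path

def images_with_package(sbom_dir: str, package: str, version_prefix: str = "") -> list:
    """Scan a directory of SPDX JSON SBOMs (one per image) and return the
    images whose SBOM lists the given package, optionally filtered by a
    version prefix (e.g. "2." to match any 2.x release)."""
    affected = []
    for sbom_path in sorted(Path(sbom_dir).glob("*.spdx.json")):
        doc = json.loads(sbom_path.read_text())
        for pkg in doc.get("packages", []):
            if (pkg.get("name") == package
                    and pkg.get("versionInfo", "").startswith(version_prefix)):
                # SPDX "name" is the document name; fall back to the filename
                affected.append(doc.get("name", sbom_path.stem))
                break
    return affected

# Usage: images_with_package("./sboms", "log4j-core", "2.")
```

At scale you would load the SBOMs into a database rather than walk files, but the query is the same set membership test.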
Runtime Security with Falco
Scanning images at build time catches known vulnerabilities. Falco catches anomalous behavior at runtime, things that are not in any vulnerability database because they are specific to your environment.
Falco works by monitoring system calls. You define rules for behavior you consider suspicious:
# falco_rules.yaml
- rule: Detect shell in container
  desc: A shell was spawned inside a container
  condition: >
    spawned_process and container and
    proc.name = bash
  output: >
    Shell spawned in container
    (user=%user.name container=%container.name
    image=%container.image.repository)
  priority: WARNING

- rule: Detect crypto mining
  desc: Detect execution of known crypto miner
  condition: >
    spawned_process and
    proc.name in (cpuminer, nanominer, ethminer)
  output: >
    Crypto miner detected
    (user=%user.name command=%proc.cmdline)
  priority: CRITICAL
Deploy Falco as a DaemonSet in your cluster. It will generate events for every suspicious behavior it sees.
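With `json_output: true` in its config, Falco emits one JSON object per event, which makes routing alerts by priority a small filtering job. A sketch of such a consumer; the function name and the WARNING threshold are illustrative, while the `rule` and `priority` fields match Falco's JSON output:

```python
import json

# Falco priorities, most to least severe
PRIORITY_ORDER = ["EMERGENCY", "ALERT", "CRITICAL", "ERROR",
                  "WARNING", "NOTICE", "INFORMATIONAL", "DEBUG"]

def events_at_or_above(lines, threshold="WARNING"):
    """Parse Falco JSON log lines and keep (rule, priority) pairs for
    events at or above the given priority threshold."""
    cutoff = PRIORITY_ORDER.index(threshold)
    kept = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON log noise interleaved in the stream
        prio = event.get("priority", "DEBUG").upper()
        if prio in PRIORITY_ORDER and PRIORITY_ORDER.index(prio) <= cutoff:
            kept.append((event.get("rule"), prio))
    return kept

# Usage: feed it lines from `kubectl logs -l app=falco -n falco`
# and forward the kept events to your alerting system.
```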
Non-Root Users and Read-Only Root Filesystems
Design your containers to run as non-root by default. This is harder than it sounds because many official images run as root internally.
# Create a non-root user in your Dockerfile
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# If a build step needs root, switch temporarily and drop back
USER root
RUN some-privileged-operation
USER appuser
Pair non-root users with read-only filesystems. If an attacker compromises your container, they cannot write to the filesystem.
# Kubernetes pod spec
securityContext:
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 10000
You will need to identify which directories need write access and mount them as volumes.
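For example, an app that writes temp files and a local cache can keep a read-only root filesystem with emptyDir volumes mounted only at the paths that need writes. A pod-spec sketch; the volume names and mount paths are illustrative:

```yaml
# Read-only root, writable mounts only where the app actually writes
spec:
  containers:
    - name: app
      image: myregistry/myapp:latest
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 10000
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /var/cache/app
  volumes:
    - name: tmp
      emptyDir: {}
    - name: cache
      emptyDir: {}
```

A quick way to discover the required paths is to run the container with a read-only root in staging and watch for EROFS errors in the logs.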
Seccomp and AppArmor Profiles
Seccomp (secure computing mode) restricts the system calls a container can make. By default, containers can make hundreds of system calls. Seccomp lets you whittle that down to the handful your application actually needs.
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["read", "write", "exit", "sigreturn"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
AppArmor works at a higher level, controlling file access, capabilities, and network access based on profiles.
# Apply an AppArmor profile to a container (Kubernetes annotation; the key ends with the container name)
container.apparmor.security.beta.kubernetes.io/<container-name>: "runtime/default"
Docker applies a default seccomp profile that blocks about 44 system calls. Kubernetes does not apply any default seccomp profile, so you need to set it explicitly if you want it.
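Opting into the container runtime's default seccomp profile in Kubernetes is a single field in the pod (or container) securityContext, supported since Kubernetes 1.19:

```yaml
# Pod-level securityContext: use the runtime's default seccomp profile
securityContext:
  seccompProfile:
    type: RuntimeDefault
```

Use `type: Localhost` with a `localhostProfile` path instead when you have a custom profile like the JSON example above installed on the node.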
Supply Chain Security
The SolarWinds and Codecov breaches showed what happens when attackers compromise upstream supply chains. Your containers are only as secure as their dependencies.
flowchart LR
    A[Image Build] --> B[Trivy Scan]
    B --> C{Vulnerabilities found?}
    C -->|High/Critical| D[Block Deploy]
    C -->|None/Low| E[Generate SBOM]
    E --> F[Cosign Sign]
    F --> G[Push to Registry]
    G --> H[Kyverno Policy Check]
    H --> I{Signature Valid?}
    I -->|No| J[Reject Pod]
    I -->|Yes| K[Deploy to Cluster]
    K --> L[Falco Runtime Monitor]
    L --> M[Alert on Anomaly]
Pin base images to specific digests, not tags. Tags are mutable; a node:18-alpine today is not the same as node:18-alpine in six months.
# Pin to digest, not tag
FROM node@sha256:a1b2c3d4e5f6... AS builder
Use image signing. Cosign (part of Sigstore) lets you sign images and verify signatures at runtime.
# Sign an image
cosign sign --key cosign.key myregistry/myapp:latest
# Verify in Kubernetes with Kyverno
kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        # myregistry/* and the public key are placeholders for your values
        - imageReferences:
            - "myregistry/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      ...
                      -----END PUBLIC KEY-----
EOF
Production Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| Trivy blocking deployment for critical CVE with no immediate patch | Build pipeline halts, deployment delayed | Establish a vulnerability exception process with risk acceptance sign-off, prioritize CVEs by exploitability (EPSS score) over severity alone |
| Falco false positives causing alert fatigue | Security team ignores alerts, real threats missed | Tune Falco rules to your environment, suppress known-false positives, review rule effectiveness quarterly |
| Container running as root escaping to host | Attacker gains host access, full cluster compromise | Enforce runAsNonRoot: true via Pod Security Admission or an admission policy (Kyverno, Gatekeeper), fail builds that produce root containers |
| Supply chain compromise via malicious base image | Backdoored image deployed to production | Pin base images to digests, use Cosign signature verification, scan all third-party images in CI |
| :latest tag image mutation causing inconsistency | Different nodes run different image versions, unpredictable behavior | Always tag builds with commit SHA, never pull :latest in production |
Container Security Trade-offs
| Scenario | Trivy | Grype | Notes |
|---|---|---|---|
| Vulnerability database size | Large | Large | Both cover major OS and language package feeds |
| SBOM generation | Via Syft | Native | Grype handles SBOMs directly; Trivy requires Syft as separate step |
| CI integration | Native | Native | Both exit non-zero on findings |
| JSON output for automation | Yes | Yes | Both produce structured output |
| Speed (large images) | Fast | Fast | Comparable performance |
| Scenario | Falco (runtime) | Prevention-only | Notes |
|---|---|---|---|
| Detects zero-days | Yes | No | Runtime monitoring catches novel attacks |
| Performance overhead | Low (~5%) | None | Falco adds minimal latency |
| Requires tuning | Yes | No | Falco needs rule customization per environment |
| Compliance value | Medium | Low | Falco provides audit trail for behavior |
| Scenario | Image signing required | Signing optional | Notes |
|---|---|---|---|
| Multi-team registry | Yes | No | Signature verification prevents unauthorized pushes |
| Single-person builds | No | Yes | Key management overhead exceeds risk without multiple contributors |
| Regulated environments | Yes | No | SOC 2, PCI-DSS often require artifact signing |
| Scenario | Rootless containers | Privileged containers | Notes |
|---|---|---|---|
| Security posture | Strong | Weak | Rootless significantly reduces container escape impact |
| Compatibility | Most apps work | Legacy apps may need root | Worth migrating legacy apps rather than running privileged |
| Performance | No overhead | No overhead | No reason to use privileged containers |
Container Security Observability
Monitor CVE counts per image as a metric in your CI pipeline. A spike in critical CVEs for an image you have not changed means one of your dependencies released a bad update. Set up alerts when image scan results change between builds.
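Alerting on scan-result changes between builds reduces to a set difference over vulnerability IDs. A sketch comparing two Trivy JSON reports from consecutive builds; the function name is illustrative:

```python
def new_cves(previous: dict, current: dict) -> set:
    """Return vulnerability IDs present in the current Trivy JSON report
    but absent from the previous one."""
    def ids(report):
        return {
            vuln["VulnerabilityID"]
            for result in report.get("Results", [])
            for vuln in result.get("Vulnerabilities") or []
        }
    return ids(current) - ids(previous)

# Usage: alert when new_cves(last_build_report, this_build_report) is
# non-empty for an image whose Dockerfile did not change.
```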
Falco alert volume per rule tells you which rules are worth keeping. Rules that fire hundreds of times a day are noise. Suppress or remove them so real anomalies stand out.
Track container restart rates. Containers that restart every few minutes are either crashing or being evicted repeatedly. Both are worth investigating.
Key commands:
# Trivy scan with JSON output for metrics extraction
trivy image --exit-code 1 --severity HIGH,CRITICAL --format json myregistry/myapp:latest > scan-results.json
# Count CVEs by severity
jq '[.Results[].Vulnerabilities[]?.Severity] | group_by(.) | map({severity: .[0], count: length})' scan-results.json
# Falco alert volume by rule in the last hour
kubectl logs -l app=falco -n falco --since=1h | jq -r '.rule' | sort | uniq -c | sort -rn
# List images with most critical vulnerabilities across your cluster
kubectl get pods -A -o jsonpath='{range .items[*]}{.spec.containers[*].image}{"\n"}{end}' | tr ' ' '\n' | sort -u | while read img; do echo "$img: $(trivy image --quiet --severity CRITICAL "$img" 2>/dev/null | grep -c CRITICAL || echo 0)"; done | sort -t: -k2 -rn | head -10
Common Anti-Patterns
Running containers as root. Many official images run as root internally. If your container escapes, the attacker has root on the host. Use runAsNonRoot: true and design your images with a non-root user from the start.
Not scanning images. Skipping scans to speed up builds means known vulnerabilities reach production. Block high and critical CVEs in CI. If the build cannot pass, that is the signal to fix the dependency.
Using the :latest tag. When you pull node:18-alpine, you get whatever node:18-alpine means today. Pin to digests: node:18-alpine@sha256:abc123.... Test your builds against a fixed version.
Not signing images. In any environment where untrusted parties can push to your registry, signature verification prevents unauthorized images from running. Cosign makes this straightforward.
Skipping runtime monitoring. Image scanning only catches known vulnerabilities. An attacker exploiting a misconfiguration or a zero-day will not show up in any scan. Falco closes that gap.
Quick Recap
Key Takeaways
- Image scanning catches known vulnerabilities; runtime monitoring catches anomalous behavior
- Pin base images to digests, not tags, to prevent supply chain drift
- Run containers as non-root with read-only filesystems to limit container escape blast radius
- Cosign signatures prevent unauthorized images from reaching your cluster
- Falco complements scanning by detecting post-deployment anomalies
Container Security Checklist
# 1. Scan every image in CI, block on HIGH/CRITICAL
trivy image --exit-code 1 --severity HIGH,CRITICAL myregistry/myapp:$GIT_COMMIT
# 2. Pin base images to digest
FROM node@sha256:abc123... AS builder
# 3. Build as non-root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# 4. Enforce read-only root filesystem
securityContext:
readOnlyRootFilesystem: true
# 5. Sign images with Cosign
cosign sign --key cosign.key myregistry/myapp:$GIT_COMMIT
# 6. Verify signatures in Kubernetes with Kyverno
kubectl apply -f kyverno-policy-require-signed-images.yaml
# 7. Deploy Falco as DaemonSet
helm install falco falcosecurity/falco -n falco --create-namespace
For more on securing Kubernetes workloads, see Network Security. For secrets handling, see Secrets Management.
Trade-off Summary
| Layer | Tool | Preventative vs Detective | CI/CD vs Runtime |
|---|---|---|---|
| Image scanning | Trivy, Grype, Snyk | Preventative | CI/CD |
| Sig verification | Cosign, Notary | Preventative | CI/CD + Registry |
| Runtime monitoring | Falco, Sysdig | Detective | Runtime |
| Policy enforcement | OPA Gatekeeper, Kyverno | Preventative | Admission control |
| User namespace remapping | --userns-remap | Preventative | Daemon config |
| Syscall filtering | seccomp, AppArmor, SELinux | Preventative | Daemon config |
| Network policies | K8s NetworkPolicy | Preventative | Runtime |
Interview Questions
Q: A container is running as root in a production pod. What are the risks and how do you fix it?
A: Running as root means if an attacker escapes the container, they have root access to the host. Risks include container breakout to host filesystem, binding to privileged ports, and capability escalation. Fix by: setting runAsNonRoot: true in the pod security context, using a non-root user in the Dockerfile (USER instruction), and ensuring the image builds with a non-root user. Also set allowPrivilegeEscalation: false and drop all capabilities (capabilities.drop: ["ALL"]).
Q: You discover a critical CVE in a base image used across 200 microservices. Walk through your response.
A: First, stop the bleeding: block the vulnerable image version in your CI/CD admission control (OPA Gatekeeper or Kyverno). Identify all affected services via your image registry tags and deployment inventory. Prioritize by exposure (internet-facing vs internal) and data sensitivity. Build and push fixed images for the highest-priority services, test, and deploy. For lower-priority services, schedule the rebuild into sprint planning. Set up automatic vulnerability scanning on new image pushes to catch this earlier. Consider a "golden image" strategy where security hardens a base image centrally.
Q: How do you prevent a compromised CI/CD pipeline from deploying malicious images?
A: Use image signing and verification: sign images with Cosign or Notary during the build pipeline, then verify signatures at admission time using a policy controller (Kyverno or OPA Gatekeeper). Store signing keys in a KMS (AWS KMS, Google Cloud KMS, HashiCorp Vault). Enable admission control to reject unsigned images. Use short-lived tokens for CI/CD authentication rather than long-lived credentials. Audit all image pull events. Maintain an SBOM for each image so you can track exactly what went into it.
Q: What is the difference between seccomp, AppArmor, and SELinux in the context of container security?
A: Seccomp restricts syscalls a container can make at the kernel level — the most granular control but requires knowing which syscalls an application needs. AppArmor works at the application level, restricting capabilities and file access paths — easier to use for known application profiles. SELinux works at the system level, labeling files and processes — most powerful but complex to configure. In practice: Docker defaults ship with a sensible seccomp profile blocking dangerous syscalls. For Kubernetes, seccomp via securityContext.seccompProfile and AppArmor via container.apparmor.security.beta.kubernetes.io are the common paths. SELinux is typically used at the host level.
Q: How do you detect that a container has been compromised at runtime?
A: Runtime detection tools like Falco monitor syscall behavior and flag anomalous activity: a shell spawning inside a container, unexpected network connections, writing to sensitive paths like /etc/ or /root/. Sysdig captures system calls for deeper analysis. Network monitoring detects exfiltration attempts via unusual outbound traffic. Integrate these with your SIEM or alerting system. Also monitor container restart counts, unexpected process trees (kubectl top pods showing unusual CPU), and node-level indicators like new SSH keys in /root/.ssh/.
Conclusion
Container security is layers. Image scanning catches known vulnerabilities before deployment. Runtime monitoring catches behavior that does not match your expectations. Non-root users and restricted syscalls limit what a compromised container can do. Supply chain controls keep your dependencies honest.
None of these layers are sufficient alone. Combine them, automate them, and treat security as part of your build pipeline, not an afterthought.
Category
Related Posts
DevOps & Cloud Infrastructure Roadmap: From Containers to Cloud-Native Deployments
Master DevOps practices with this comprehensive learning path covering Docker, Kubernetes, CI/CD pipelines, infrastructure as code, and cloud-native deployment strategies.
Container Images: Building, Optimizing, and Distributing
Learn how Docker container images work, layer caching strategies, image optimization techniques, and how to publish your own images to container registries.
Container Registry: Image Storage, Scanning, and Distribution
Set up and secure container registries for storing, scanning, and distributing container images across your CI/CD pipeline and clusters.