Part 5 – Kubernetes Day 1 Security Configuration

🔒 Part 5: Kubernetes Day 1 - Security Configuration: Building Fort Knox in the Cloud

When $2.3M disappeared because someone forgot to enable network policies. A real Day 1 security story.

The $2.3 Million Security Oversight

It was 2 AM when Sarah's phone exploded with alerts. As the security lead for a rapidly growing fintech startup, she'd seen her share of incidents. But nothing prepared her for this.

"We've been breached. They're draining customer accounts."

The post-mortem was brutal: A single misconfigured pod without network policies allowed lateral movement across the entire cluster. The attacker exploited a vulnerability in a third-party library, moved laterally to the database pods, and exfiltrated credentials. All because security was treated as a "Day 2 problem."

The lesson? Security isn't a feature you add later. It's the foundation you build on Day 1.

This is Part 5 of our Kubernetes Production Journey series, where we transform your cluster from a house of cards into Fort Knox. We'll cover everything from network segmentation to supply chain security, with battle-tested configurations from 50+ production deployments.

🎯 What You'll Learn

By the end of this article, you'll know how to:

  • Layer 1: Implement zero-trust network policies with Cilium
  • Layer 2: Enforce Pod Security Standards (restricted profile)
  • Layer 3: Automate policy enforcement with OPA/Kyverno
  • Layer 4: Manage secrets securely with HashiCorp Vault
  • Layer 5: Detect runtime threats with Falco
  • Layer 6: Implement least-privilege RBAC
  • Layer 7: Secure your supply chain with Sigstore
  • Layer 8: Maintain compliance with CIS benchmarks

All code examples are available in our GitHub repository.

🏗️ Security Architecture: Defense in Depth

Security in Kubernetes isn't a single checkbox - it's multiple layers that work together. Think of it like a medieval castle: walls, moats, guards, and inner keeps. If one layer fails, others protect you.

┌─────────────────────────────────────────┐
│   Layer 8: Compliance (CIS Benchmarks)  │ ← kube-bench, encryption at rest
├─────────────────────────────────────────┤
│   Layer 7: Supply Chain Security        │ ← Cosign, SBOM, Provenance
├─────────────────────────────────────────┤
│   Layer 6: Access Control (RBAC)        │ ← Who can do what
├─────────────────────────────────────────┤
│   Layer 5: Runtime Security             │ ← Falco threat detection
├─────────────────────────────────────────┤
│   Layer 4: Secrets Management           │ ← Vault, encryption
├─────────────────────────────────────────┤
│   Layer 3: Policy Enforcement           │ ← OPA/Kyverno
├─────────────────────────────────────────┤
│   Layer 2: Pod Security                 │ ← PSS, security contexts
├─────────────────────────────────────────┤
│   Layer 1: Network Security             │ ← Network policies, mTLS
└─────────────────────────────────────────┘

Let's build each layer, starting from the foundation.

🌐 Layer 1: Network Security - Zero Trust Networking

The Principle: Never trust, always verify. Every pod-to-pod communication must be explicitly allowed.

Why Network Policies Matter

In the breach story above, the attacker moved from the frontend pod → backend pod → database pod without any restriction. With proper network policies, that lateral movement would have been impossible.

Implementing Cilium Network Policies

Cilium provides powerful network policies with L7 (HTTP/gRPC) filtering, which standard Kubernetes NetworkPolicies can't do.

Example: Deny-All Default Policy

# File: 01-network-policies/cilium-policies.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: deny-all-default
  namespace: production
spec:
  endpointSelector: {}
  ingress: []
  egress: []
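
If you're not running Cilium, the same default-deny posture (minus L7 filtering) can be expressed with a standard Kubernetes NetworkPolicy; a minimal sketch:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress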

With either variant applied, the namespace has a default-deny posture. Now we selectively allow traffic:

Allow Frontend → Backend (with L7 HTTP filtering)

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
  - toEndpoints:
    - matchLabels:
        app: backend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET|POST|PUT|DELETE"
          path: "/api/.*"

Key Features:

  • L7 filtering: Only specific HTTP methods and paths allowed
  • Explicit allow-listing: Frontend can only talk to backend on port 8080
  • API path restrictions: Only /api/* paths are reachable, so endpoints outside that prefix (e.g. admin routes) are blocked

Block Metadata Service Access (Critical for cloud security)

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: block-metadata-service
spec:
  endpointSelector: {}
  egressDeny:
  - toCIDR:
    - 169.254.169.254/32  # AWS/Azure/GCP metadata (IPv4)
    - fd00:ec2::254/128   # AWS IPv6 metadata

This prevents pods from accessing cloud provider metadata services, which often contain sensitive credentials.
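
One practical consequence of default-deny egress: pods also lose DNS resolution unless you allow it explicitly. A minimal sketch, assuming the standard CoreDNS/kube-dns labels in kube-system:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  endpointSelector: {}
  egress:
  - toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP
      - port: "53"
        protocol: TCP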

Istio mTLS for Service Mesh Security

If you're using Istio, enable mutual TLS for encrypted pod-to-pod communication:

# File: 01-network-policies/istio-mtls.yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default-strict-mtls
  namespace: istio-system
spec:
  mtls:
    mode: STRICT

This encrypts all traffic in the mesh and verifies pod identities automatically.

View full network policies →

🛡️ Layer 2: Pod Security - Hardening Your Workloads

The Principle: Run containers with the least privileges necessary. No root, no privilege escalation, read-only filesystems.

Pod Security Standards (PSS)

Kubernetes 1.25 removed PodSecurityPolicies; the replacement is Pod Security Standards (PSS), enforced by the built-in Pod Security Admission controller. There are three profiles:

  • Privileged: Unrestricted (avoid in production)
  • Baseline: Minimal restrictions
  • Restricted: Hardened, best for production

Enforcing Restricted Profile

# File: 02-pod-security/pss-baseline.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Now any pod that violates the restricted profile will be rejected.
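
To confirm enforcement is active, try a server-side dry run of an unhardened pod; the admission controller should reject it with a PodSecurity violation (the pod name and image here are just examples):

# Expect: pods "pss-test" is forbidden: violates PodSecurity "restricted:latest"
kubectl run pss-test --image=busybox -n production --dry-run=server -- sleep 3600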

Production-Ready Secure Pod Template

Here's a golden template for secure pods:

# File: 02-pod-security/secure-pod-template.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      # Disable automounting service account tokens
      automountServiceAccountToken: false
      
      # Pod-level security context
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      
      containers:
      - name: app
        image: myapp:v1.0.0
        imagePullPolicy: Always
        
        # Container-level security context
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 10001
          capabilities:
            drop:
            - ALL
        
        # Resource limits (prevent resource exhaustion attacks)
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        
        # Liveness and readiness probes
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
        
        # Volume mounts for writable directories
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
      
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}

Security Features Explained:

  1. No Root User: runAsNonRoot: true prevents running as root
  2. Read-Only Filesystem: Prevents attackers from modifying files
  3. Drop All Capabilities: Removes all Linux capabilities
  4. SecComp Profile: Restricts syscalls to prevent container escapes
  5. No Privilege Escalation: Prevents gaining higher privileges
  6. Resource Limits: Prevents resource exhaustion attacks
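
Once the deployment is running, a quick exec confirms the container really isn't root, assuming the image ships the id utility (the UID matches runAsUser above):

kubectl -n production exec deploy/secure-app -- id
# Expect uid=10001, not uid=0 (root)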

View pod security templates →

⚖️ Layer 3: Policy Enforcement - Automated Governance

The Principle: Don't rely on developers to remember security best practices. Enforce them automatically.

Kyverno: Kubernetes-Native Policy Engine

Kyverno is easier to learn than OPA and perfect for Kubernetes-specific policies.

Policy 1: Require All Images from Trusted Registry

# File: 03-policy-enforcement/kyverno-policies.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from approved registries: registry.company.com or ghcr.io"
      pattern:
        spec:
          containers:
          - image: "registry.company.com/* | ghcr.io/*"

Policy 2: Enforce Security Context

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-security-context
spec:
  validationFailureAction: enforce
  rules:
  - name: check-security-context
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "All containers must define securityContext with runAsNonRoot and readOnlyRootFilesystem"
      pattern:
        spec:
          containers:
          - securityContext:
              runAsNonRoot: true
              readOnlyRootFilesystem: true
              allowPrivilegeEscalation: false

Policy 3: Auto-Add Labels

Kyverno can also mutate resources to add missing configurations:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-labels
spec:
  rules:
  - name: add-labels
    match:
      any:
      - resources:
          kinds:
          - Deployment
          - StatefulSet
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            managed-by: kyverno
            security-reviewed: "true"

View Kyverno policies →

🔐 Layer 4: Secrets Management - Vault Integration

The Principle: Never store secrets in plain text. Use a centralized secrets manager with encryption, rotation, and audit logging.

Why Not Kubernetes Secrets?

Kubernetes Secrets are Base64-encoded, not encrypted. Anyone with etcd access can read them. For production, use HashiCorp Vault or cloud-native solutions (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager).
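
For example, anyone with read access to a Secret object can recover the plaintext with a single command (the secret and key names here are illustrative):

kubectl -n production get secret database-secret -o jsonpath='{.data.password}' | base64 -d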

HashiCorp Vault Integration

Step 1: Deploy Vault with Helm

# Add Vault Helm repo
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update

# Install Vault in HA mode
helm install vault hashicorp/vault \
  --namespace vault \
  --create-namespace \
  --set server.ha.enabled=true \
  --set server.ha.replicas=3 \
  --set injector.enabled=true

Step 2: Enable Kubernetes Auth

# Initialize Vault (save keys securely!)
kubectl exec vault-0 -n vault -- vault operator init

# Unseal Vault on all replicas
kubectl exec vault-0 -n vault -- vault operator unseal <key1>
kubectl exec vault-1 -n vault -- vault operator unseal <key1>
kubectl exec vault-2 -n vault -- vault operator unseal <key1>

# Enable Kubernetes auth
kubectl exec vault-0 -n vault -- vault auth enable kubernetes

# Configure Kubernetes auth
kubectl exec vault-0 -n vault -- vault write auth/kubernetes/config \
    kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443"

Step 3: Store Secrets in Vault

# Enable KV secrets engine
kubectl exec vault-0 -n vault -- vault secrets enable -path=secret kv-v2

# Store database credentials
kubectl exec vault-0 -n vault -- vault kv put secret/database/config \
    username="dbadmin" \
    password="super-secret-password"

Step 4: Inject Secrets into Pods

# File: 04-secrets-management/vault-injection.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-vault
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "myapp"
        vault.hashicorp.com/agent-inject-secret-database: "secret/data/database/config"
        vault.hashicorp.com/agent-inject-template-database: |
          {{- with secret "secret/data/database/config" -}}
          export DB_USERNAME="{{ .Data.data.username }}"
          export DB_PASSWORD="{{ .Data.data.password }}"
          {{- end }}
      labels:
        app: myapp
    spec:
      serviceAccountName: myapp
      containers:
      - name: app
        image: myapp:v1.0.0
        command: ["/bin/sh"]
        args: ["-c", "source /vault/secrets/database && ./app"]

What Happens:

  1. Vault Agent sidecar is automatically injected
  2. Agent authenticates with Kubernetes auth
  3. Secrets are fetched from Vault and written to /vault/secrets/database
  4. Application sources the file and uses environment variables

External Secrets Operator (ESO)

For cloud-native secrets (AWS Secrets Manager, Azure Key Vault), use External Secrets Operator:

# File: 04-secrets-management/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secretsmanager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: database-secret
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: prod/database/credentials
      property: username
  - secretKey: password
    remoteRef:
      key: prod/database/credentials
      property: password
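
The operator materializes database-secret as an ordinary Kubernetes Secret in the namespace, so workloads consume it the usual way; a minimal sketch (securityContext omitted for brevity, pair it with the Layer 2 template in practice):

apiVersion: v1
kind: Pod
metadata:
  name: db-client
  namespace: production
spec:
  containers:
  - name: app
    image: myapp:v1.0.0
    envFrom:
    - secretRef:
        name: database-secret  # created by the ExternalSecret above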

View secrets management configs →

🚨 Layer 5: Runtime Security - Threat Detection with Falco

The Principle: Detect and respond to suspicious behavior at runtime, not just at deploy time.

Installing Falco

# Add Falco Helm repo
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

# Install Falco with Sidekick for alerts
helm install falco falcosecurity/falco \
  --namespace falco-system \
  --create-namespace \
  --set falcosidekick.enabled=true \
  --set falcosidekick.webui.enabled=true

Custom Falco Rules

# File: 05-runtime-security/falco-rules.yaml
- rule: Detect Shell in Container
  desc: Detect when a shell is spawned in a container
  condition: >
    spawned_process and
    container and
    proc.name in (shell_binaries)
  output: >
    Shell spawned in container (user=%user.name container=%container.name
    shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]

- rule: Write Below Binary Directory
  desc: Detect attempts to write to system binary directories
  condition: >
    open_write and
    container and
    fd.directory in (/bin, /usr/bin, /sbin, /usr/sbin)
  output: >
    File write below binary directory (user=%user.name command=%proc.cmdline
    file=%fd.name container=%container.name)
  priority: ERROR
  tags: [filesystem, container]

# Custom list: replace the placeholder with real threat-intel IPs
- list: malicious_ips
  items: ['"192.0.2.1"']

- rule: Outbound Connection to Known C2 Server
  desc: Detect connections to known command and control servers
  condition: >
    outbound and
    container and
    fd.sip in (malicious_ips)
  output: >
    Outbound connection to suspicious IP (user=%user.name container=%container.name
    ip=%fd.rip port=%fd.rport)
  priority: CRITICAL
  tags: [network, malware]

- rule: Read Sensitive File
  desc: Detect reading of sensitive files like /etc/shadow
  condition: >
    open_read and
    container and
    fd.name in (/etc/shadow, /etc/sudoers, /root/.ssh/id_rsa)
  output: >
    Sensitive file read (user=%user.name file=%fd.name container=%container.name
    command=%proc.cmdline)
  priority: CRITICAL
  tags: [filesystem, security]
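
To verify the pipeline end to end, trigger the shell rule on purpose and check Falco's output (the label selector assumes the Helm chart's default labels):

# Trigger "Detect Shell in Container"
kubectl -n production exec deploy/secure-app -- sh -c 'echo falco-test'

# Look for the resulting alert
kubectl -n falco-system logs -l app.kubernetes.io/name=falco | grep "Shell spawned"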

Falco Alert Integration

Route Falco alerts to Slack, PagerDuty, or your SIEM:

# File: 05-runtime-security/falcosidekick-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: falcosidekick-config
  namespace: falco-system
data:
  config.yaml: |
    slack:
      webhookurl: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
      minimumpriority: "warning"
      outputformat: "text"
    
    pagerduty:
      integrationkey: "YOUR_PAGERDUTY_KEY"
      minimumpriority: "error"
    
    elasticsearch:
      hostport: "https://elasticsearch:9200"
      index: "falco"
      minimumpriority: "debug"

View runtime security configs →

👤 Layer 6: Access Control - Least-Privilege RBAC

The Principle: Grant the minimum permissions necessary. No cluster-admin for applications.

RBAC Best Practices

  1. Never use cluster-admin for applications
  2. Use namespace-scoped Roles, not ClusterRoles
  3. Create service accounts per application
  4. Use Groups for human users
  5. Regularly audit permissions

Example: Developer Role

# File: 06-rbac/developer-role.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-alpha
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: developer-sa
  namespace: team-alpha
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-role
  namespace: team-alpha
rules:
# Read-only access to most resources
- apiGroups: ["", "apps", "batch"]
  resources: ["pods", "deployments", "jobs", "services", "configmaps"]
  verbs: ["get", "list", "watch"]
# Can view logs
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list"]
# Can exec for debugging (carefully consider this)
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
# Note: RBAC is allow-only, so secrets and persistentvolumes are already
# off-limits simply because no rule above grants access to them
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
- kind: ServiceAccount
  name: developer-sa
  namespace: team-alpha
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role
  apiGroup: rbac.authorization.k8s.io
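
After applying the binding, kubectl auth can-i confirms the effective permissions:

# Expect "yes"
kubectl auth can-i list pods -n team-alpha --as=system:serviceaccount:team-alpha:developer-sa

# Expect "no"
kubectl auth can-i delete deployments -n team-alpha --as=system:serviceaccount:team-alpha:developer-sa

# Full listing
kubectl auth can-i --list -n team-alpha --as=system:serviceaccount:team-alpha:developer-sa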

Example: CI/CD Pipeline Role

# File: 06-rbac/cicd-role.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cicd-deployer
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd-deployer-role
  namespace: production
rules:
# Can create and update deployments
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "create", "update", "patch"]
# Can create and update services
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list", "create", "update", "patch"]
# Can create and update configmaps (not secrets!)
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "create", "update", "patch"]
# Can read secrets for validation only
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list"]
# Can view pods and logs for debugging
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cicd-deployer-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: cicd-deployer
  namespace: production
roleRef:
  kind: Role
  name: cicd-deployer-role
  apiGroup: rbac.authorization.k8s.io

RBAC Audit Script

#!/bin/bash
# File: 06-rbac/audit-rbac.sh

echo "=== Cluster-Wide RBAC Audit ==="
echo ""

echo "1. Users/ServiceAccounts with cluster-admin:"
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name=="cluster-admin") | 
  .metadata.name + " -> " + (.subjects[]?.name // "N/A")'

echo ""
echo "2. ServiceAccounts that can create pods (potential security risk):"
kubectl get rolebindings,clusterrolebindings -A -o json | \
  jq -r '.items[] | select(.roleRef.name | contains("admin") or contains("edit")) |
  .metadata.namespace + "/" + .metadata.name'

echo ""
echo "3. Overly permissive roles (wildcard permissions):"
kubectl get roles,clusterroles -A -o json | \
  jq -r '.items[] | select(.rules[]? | .verbs[]? == "*") |
  .metadata.namespace + "/" + .metadata.name'

View RBAC configurations →

📦 Layer 7: Supply Chain Security - Image Signing with Sigstore

The Principle: Trust but verify. Only deploy signed, verified container images.

Why Supply Chain Security Matters

Recent incidents (SolarWinds, Log4Shell) showed how attackers can compromise the software supply chain, whether by tampering with the build pipeline or by poisoning a widely used dependency. Image signing ensures the images you deploy haven't been tampered with.

Installing Sigstore Cosign

# Install cosign (-L follows the GitHub release redirect)
curl -LO https://github.com/sigstore/cosign/releases/latest/download/cosign-linux-amd64
sudo mv cosign-linux-amd64 /usr/local/bin/cosign
sudo chmod +x /usr/local/bin/cosign

Signing Container Images

# Generate keypair
cosign generate-key-pair

# Sign image
cosign sign --key cosign.key registry.company.com/myapp:v1.0.0

# Verify signature
cosign verify --key cosign.pub registry.company.com/myapp:v1.0.0

Enforce Signed Images with Kyverno

# File: 07-supply-chain/verify-images-policy.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: enforce
  background: false
  webhookTimeoutSeconds: 30
  rules:
  - name: verify-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences:
      - "registry.company.com/*"
      attestors:
      - count: 1
        entries:
        - keys:
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...
              -----END PUBLIC KEY-----

Now any unsigned image will be rejected at admission time.

Generate SBOMs (Software Bill of Materials)

# Install Syft
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh

# Generate SBOM
syft registry.company.com/myapp:v1.0.0 -o json > sbom.json

# Attach SBOM to image
cosign attach sbom --sbom sbom.json registry.company.com/myapp:v1.0.0

Scan Images for Vulnerabilities

# Install Trivy
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh

# Scan image
trivy image --severity HIGH,CRITICAL registry.company.com/myapp:v1.0.0

# Fail CI/CD if vulnerabilities found
trivy image --exit-code 1 --severity CRITICAL registry.company.com/myapp:v1.0.0

View supply chain security configs →

📋 Layer 8: Compliance - CIS Kubernetes Benchmarks

The Principle: Meet compliance requirements from Day 1. Don't scramble during audits.

Running CIS Benchmarks with kube-bench

# Run kube-bench
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml

# View results
kubectl logs job/kube-bench

# Example output:
# [PASS] 1.2.1 Ensure that the --anonymous-auth argument is set to false
# [FAIL] 1.2.2 Ensure that the --basic-auth-file argument is not set
# [WARN] 1.2.3 Ensure that the --token-auth-file parameter is not set

Automated Compliance Scanning

# File: 08-compliance/kube-bench-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kube-bench-scan
  namespace: security
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          hostPID: true
          containers:
          - name: kube-bench
            image: aquasec/kube-bench:latest
            command: ["kube-bench", "run", "--targets", "node,policies"]
            volumeMounts:
            - name: var-lib-etcd
              mountPath: /var/lib/etcd
              readOnly: true
            - name: etc-kubernetes
              mountPath: /etc/kubernetes
              readOnly: true
          restartPolicy: Never
          volumes:
          - name: var-lib-etcd
            hostPath:
              path: /var/lib/etcd
          - name: etc-kubernetes
            hostPath:
              path: /etc/kubernetes

Key CIS Compliance Checks

Check   Requirement                 Implementation
1.2.1   Disable anonymous auth      --anonymous-auth=false on the API server
1.2.6   Enable RBAC                 --authorization-mode=RBAC
3.2.1   Restrict network access     Network policies
4.2.1   Run as non-root             runAsNonRoot: true
5.1.5   Secure secret encryption    Enable encryption at rest
5.2.2   Minimize privileges         Least-privilege RBAC

Enable Encryption at Rest

# File: 08-compliance/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64_ENCODED_SECRET>
      - identity: {}

Then configure API server:

--encryption-provider-config=/etc/kubernetes/encryption-config.yaml
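
The placeholder key must be a random 32-byte value, base64-encoded; one common way to generate it:

head -c 32 /dev/urandom | base64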

View compliance configurations →

🔄 Putting It All Together: Security Checklist

Use this checklist on Day 1 of every deployment:

Pre-Deployment

  • Review security architecture diagram
  • Ensure all team members complete security training
  • Prepare incident response runbook
  • Set up security monitoring dashboards

Network Layer

  • Deploy default-deny network policies
  • Configure allow-list rules for inter-service communication
  • Block cloud metadata service access
  • Enable service mesh mTLS (if using Istio/Linkerd)
  • Configure egress filtering

Pod Security

  • Enable Pod Security Standards (restricted profile)
  • Review and apply secure pod templates
  • Disable privilege escalation
  • Enable read-only root filesystem
  • Configure SecComp profiles
  • Set resource limits

Policy Enforcement

  • Install Kyverno/OPA
  • Deploy image registry restriction policies
  • Deploy security context enforcement policies
  • Configure automatic label/annotation injection
  • Set up policy violation alerts

Secrets Management

  • Deploy HashiCorp Vault or cloud secrets manager
  • Configure Kubernetes authentication
  • Migrate secrets from Kubernetes Secrets to Vault
  • Enable secret rotation
  • Configure audit logging

Runtime Security

  • Deploy Falco
  • Configure custom detection rules
  • Set up alert routing (Slack/PagerDuty)
  • Test alert notifications
  • Create incident response procedures

Access Control

  • Audit existing RBAC permissions
  • Remove cluster-admin bindings
  • Create namespace-scoped roles
  • Set up service accounts per application
  • Configure OIDC/LDAP integration for human users
  • Enable audit logging

Supply Chain

  • Set up image signing with Cosign
  • Generate and attach SBOMs
  • Configure Kyverno image verification
  • Integrate Trivy scanning in CI/CD
  • Create vulnerability management process

Compliance

  • Run kube-bench CIS scan
  • Address critical findings
  • Enable encryption at rest
  • Set up automated compliance scanning
  • Document compliance posture

🚀 Day 1 Security Deployment Timeline

Here's a realistic timeline for implementing all security layers:

Morning (9 AM - 12 PM): Foundation

  • 9:00 AM - Team briefing and security review
  • 9:30 AM - Deploy network policies (Layer 1)
  • 10:30 AM - Configure Pod Security Standards (Layer 2)
  • 11:30 AM - Deploy Kyverno policies (Layer 3)

Afternoon (1 PM - 5 PM): Advanced Security

  • 1:00 PM - Deploy and configure Vault (Layer 4)
  • 2:30 PM - Install Falco and configure rules (Layer 5)
  • 3:30 PM - Review and configure RBAC (Layer 6)
  • 4:30 PM - Set up image signing (Layer 7)

Evening (5 PM - 6 PM): Validation

  • 5:00 PM - Run CIS benchmarks (Layer 8)
  • 5:30 PM - Security testing and validation
  • 5:45 PM - Team handoff and documentation

🎓 Security Best Practices from 50+ Deployments

1. Security is Not Optional

Never treat security as "we'll add it later." The cost of retrofitting security is 10x higher than building it in from Day 1.

2. Defense in Depth Works

In every incident we've analyzed, at least one control failed; it was the remaining layers that contained the damage.

3. Automate Everything

Manual security reviews don't scale. Use policy engines (Kyverno/OPA) to enforce standards automatically.

4. Monitor Runtime Behavior

Static scanning catches most issues before deployment. Runtime monitoring (Falco) catches what slips through into production.

5. Secrets Management is Critical

Never put secrets in git, environment variables, or Kubernetes Secrets (unencrypted). Use Vault.

6. Least Privilege Always

Start with zero permissions and add only what's needed. Never start with cluster-admin and try to remove permissions.

7. Supply Chain is Your Weakest Link

Most breaches happen through compromised dependencies. Sign images, generate SBOMs, and scan continuously.

8. Test Your Security

Run chaos engineering tests that simulate breaches. Ensure your defenses actually work.

🆘 Common Security Pitfalls

Pitfall 1: "We'll Add Network Policies Later"

Impact: Lateral movement attacks, $2.3M breach (real story)
Fix: Deploy default-deny policies on Day 1

Pitfall 2: Running Containers as Root

Impact: Container escape vulnerabilities
Fix: Use runAsNonRoot: true and user ID > 10000

Pitfall 3: Using Kubernetes Secrets Without Encryption

Impact: Secrets exposed via etcd access
Fix: Use Vault or enable encryption at rest

Pitfall 4: No Runtime Security Monitoring

Impact: Breaches detected weeks/months later
Fix: Deploy Falco for real-time threat detection

Pitfall 5: Overly Permissive RBAC

Impact: Privilege escalation attacks
Fix: Audit RBAC regularly, use namespace-scoped roles

Pitfall 6: Unsigned Container Images

Impact: Supply chain attacks, malicious images
Fix: Implement Cosign signing and Kyverno verification

Pitfall 7: No Security Policy Enforcement

Impact: Inconsistent security across teams
Fix: Deploy Kyverno/OPA with enforce mode

Pitfall 8: Missing Compliance Documentation

Impact: Failed audits, regulatory fines
Fix: Run kube-bench regularly, maintain documentation

📊 Security Metrics to Track

Monitor these metrics to maintain security posture:

Reactive Metrics (What Happened)

  • Security Policy Violations: Track Kyverno/OPA denials
  • Falco Alerts: Monitor suspicious runtime behavior
  • Vulnerability Count: Track CVEs in images (Critical/High)
  • RBAC Permission Changes: Audit trail of access changes
  • Failed Authentication Attempts: Detect brute force attacks

Proactive Metrics (Prevention)

  • Pod Security Standard Compliance: % of pods meeting restricted profile
  • Image Signature Coverage: % of images signed
  • Secret Rotation Rate: How often secrets are rotated
  • CIS Benchmark Score: Track compliance improvements
  • Network Policy Coverage: % of pods with network policies

Example Prometheus Queries

Metric names vary with your exporters; the queries below assume Kyverno's built-in metrics, a Falco metrics exporter, and kube-state-metrics.

# Policy violations per hour
rate(kyverno_policy_results_total{status="fail"}[1h])

# Falco alerts by severity
falco_alerts_total{priority="critical"}

# Unsigned images deployed
sum(kyverno_policy_results_total{policy_name="verify-image-signatures",status="fail"})

# Pods running as root
count(kube_pod_container_status_running{security_context_run_as_non_root="false"})

🔗 Additional Resources

Official Documentation

  • Kubernetes Pod Security Standards and Pod Security Admission
  • Kubernetes Network Policies, RBAC, and encrypting Secrets at rest

Tools

  • Cilium, Kyverno, HashiCorp Vault, Falco
  • Sigstore Cosign, Syft, Trivy, kube-bench

Training & Certification

  • KCSA (Kubernetes and Cloud Native Security Associate)
  • CKS (Certified Kubernetes Security Specialist)
  • CNCF Security TAG

🎯 What's Next?

This completes our security foundation. In the next articles, we'll cover:

  • Part 6: Monitoring & Observability - Prometheus, Grafana, and the three pillars
  • Part 7: Disaster Recovery - Backup strategies and DR testing
  • Part 8: Cost Optimization - FinOps practices for sustainable cloud economics

All code examples are available in the GitHub repository.


💬 Your Security Story

What security incident taught you the importance of Day 1 security? Share your story in the comments below.

Found this helpful? ⭐ Star the GitHub repo and share with your team!

Questions? Open an issue or start a discussion on GitHub.


This is Part 5 of the Kubernetes Production Journey series. Read Part 1: Infrastructure Provisioning to start from the beginning.

About the Author: Platform engineer with 8+ years building production Kubernetes platforms across fintech, healthcare, and e-commerce. Passionate about security, automation, and sharing real-world lessons.


🔐 Remember: Security is not a destination, it's a continuous journey. Start with these foundations on Day 1, and iterate as threats evolve.

Stay secure! 🛡️
