May 12, 2026

Incident Response in Kubernetes (EKS)

The pager goes off at 3:00 AM.

Your security dashboard is flashing red with an alert from inside a production cluster. In the old days of physical servers, you could simply pull the network cable. But this is Kubernetes: if you simply kill the compromised pod, the scheduler will faithfully recreate it elsewhere, potentially bringing the attacker along for the ride.
Welcome to the first part of our series on incident response in managed Kubernetes clusters. Over the next three posts, we’ll be exploring how to handle security breaches across the big three: EKS (AWS), GKE (Google Cloud), and AKS (Azure), starting with EKS.

tl;dr

This guide covers how logging is handled in EKS and advises enabling it before an incident occurs. Since EKS containers are ephemeral, any data not forwarded to CloudWatch (or another log analysis platform) is lost forever on restart.

If you suspect an active compromise in your Kubernetes cluster, every second counts. Don't risk destroying volatile evidence.

Contact Invictus for 24/7 Emergency IR Support

Understanding EKS

Before we can investigate Amazon Elastic Kubernetes Service (EKS), we have to understand how it actually operates. Kubernetes clusters are notorious for their management overhead. To reduce it, Amazon offers EKS, where Amazon hosts and manages the cluster, offloading the complexity from the operator.
The cluster consists of two main parts:

  • The Control Plane: This is managed by AWS. It includes the Kubernetes API server, the etcd database (where all your cluster configs live), and the controllers.
  • The Data Plane: These are the worker nodes where your actual code runs.

An analogy: renting an apartment. The landlord is responsible for the state of the building, the plumbing and so on, so you as the renter can focus on decorating the place. However, if a guest starts a fire, the landlord isn't going to put it out for you, though they might have CCTV footage of who entered the building.

Figure 1: Shared Responsibility Model for EKS

Investigating EKS

Within EKS, there are several log sources, each serving a different purpose in a security investigation, as shown in table 1.

Table 1 - EKS Logging
| Log source | Purpose | Relevance to IR |
| --- | --- | --- |
| Kubernetes API server logs | Records all requests made to the Kubernetes API server. | Logs e.g. kubectl interactions. |
| Audit logs | Provides a detailed record of individual events, proving who did what. | Used to track activity in the cluster: which actor did what, from where. |
| Authenticator logs | Tracks RBAC and IAM logins; shows how an AWS IAM entity was mapped to a Kubernetes user/group. | Essential for finding leaked AWS keys being used to access Kubernetes. |
| Controller manager logs | Monitors the state of the cluster and logs changes made, such as a namespace being deleted. | Tracks the creation of malicious resources, such as coinmining containers or a new backdoor resource. |
| Scheduler logs | Records scheduling decisions: which worker node hosts which pod, based on resources and constraints. | Shows if an attacker is trying to pin pods to specific nodes or bypass scheduling constraints (like taints/tolerations) to gain a foothold. |
| Application logs | Logs emitted by the code running in the containers. | Shows malicious requests in (network) access logs. |

These logs are supported by several non-EKS specific log sources, as depicted in table 2.

Table 2 - AWS Log sources related to EKS
| Log source | Purpose | Relevance to IR |
| --- | --- | --- |
| AWS CloudTrail | Records every API call made in an AWS account. | Detects lateral movement from the cluster to other AWS resources via the cluster's IAM role. |
| VPC Flow Logs | Captures metadata about IP traffic entering and leaving your VPC interfaces. | Shows in- and egress traffic to the pods; used to uncover C2 traffic. |
| Amazon GuardDuty | A threat detection service that continuously monitors for unauthorized access, malware and the like. | GuardDuty alerts may provide more information on detected abuse. |
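If GuardDuty is enabled, its findings can be pulled into the investigation via the AWS CLI. A minimal sketch, assuming credentials and a default region are configured; the filter on EKS-related findings and the placeholder finding ID are illustrative:

```shell
# Look up the GuardDuty detector in the current region
DETECTOR_ID=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)

# List finding IDs, filtered to findings on EKS clusters
aws guardduty list-findings \
    --detector-id "$DETECTOR_ID" \
    --finding-criteria '{"Criterion":{"resource.resourceType":{"Eq":["EKSCluster"]}}}'

# Retrieve the full details of a specific finding
aws guardduty get-findings \
    --detector-id "$DETECTOR_ID" \
    --finding-ids <finding-id>
```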

Log collection

By default, if a pod is compromised and then crashes or is deleted, its local logs vanish. This is why shipping logs to a centralized location (like CloudWatch Logs or an S3 bucket) is strongly advised: if you aren't shipping them, the attacker can cover their tracks simply by crashing the container. The AWS CLI can be used to enable all log types from table 1:

aws eks update-cluster-config \
    --region region-code \
    --name my-cluster \
    --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'
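After applying the change, the resulting configuration can be verified. A sketch, reusing the same placeholder region and cluster name:

```shell
# Show which control plane log types are currently enabled
aws eks describe-cluster \
    --region region-code \
    --name my-cluster \
    --query 'cluster.logging.clusterLogging'
```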

To send application logs to a centralized environment (such as CloudWatch), a separate forwarding agent is required, such as Fluent Bit. Furthermore, if kubectl is configured correctly, it can be used to retrieve logs as well. The limiting factor is that kubectl can only interact with the data plane of the cluster and can thus only retrieve a handful of logs, such as the pods' stdout. Control plane logs (such as the audit logs) are handled by AWS and only exposed through the logging configuration described above.
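As a sketch of what kubectl can retrieve from the data plane (the pod name below is the hypothetical one used in the examples later in this post):

```shell
# Stdout/stderr of the currently running container
kubectl logs security-test-app-69cb774fb5-jdwv2 -n default

# Logs of the previous container instance (e.g. after a crash), if the node still has them
kubectl logs security-test-app-69cb774fb5-jdwv2 -n default --previous

# Logs of all containers in the pod
kubectl logs security-test-app-69cb774fb5-jdwv2 -n default --all-containers=true
```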

Pod Forensics

While this guide focuses on the logs from EKS itself, sometimes the investigation requires you to get your hands dirty on the worker node itself. If an attacker has deployed fileless malware, logs alone won't tell the full story. For a deep dive into capturing memory dumps, analyzing container runtimes (containerd/CRI-O), and inspecting the /var/log/pods directory on the physical host, we recommend this SANS guide on Forensics in EKS.

Mapping to Kubernetes TTPs

Microsoft released the threat matrix for Kubernetes, providing a way to map attacker behaviour (TTPs) to goals (see figure 2).

Figure 2: Kubernetes matrix, source

We can use this matrix to determine where each behaviour would be detected; see table 3. Do note that this table is non-exhaustive, as certain tactics can surface in multiple log sources, depending on the technique used.

Table 3 - K8s threat mapping to logs
| Tactic | Log source | Description |
| --- | --- | --- |
| Execution | Application logs | Shows the method used, such as exploitation of the application hosted on EKS. |
| Persistence | Audit logs | Tracks the creation of rogue resources. While the controller manager logs the action of spawning them, the audit log identifies who executed the command. |
| Privilege Escalation | Audit logs | Monitor for create or patch events on RoleBindings or ClusterRoleBindings. |
| Credential Access | Authenticator logs | Use it to see when a pod's service account assumes an unexpected AWS IAM role. |
| Discovery | Audit logs | High-frequency list and get verbs for secrets, pods, and nodes. |
| Lateral Movement | VPC Flow Logs | Detect requests towards the identity service. |
| Collection | CloudTrail | Find unauthorized image pulls from ECR. |
| Impact | GuardDuty | Alerts on unusual spikes in CPU utilization and outbound traffic to known mining pools. |

Containment & Eradication

Once a compromise has been detected, it is important to evict the actor as soon as possible to limit the damage. One option is to power off the compromised resource. This, however, comes with the caveat that non-persistent data, such as temporary files and fileless malware, is lost. Hence, it is recommended to keep the resource running and quarantine it instead:

  • Apply a "Deny-All" NetworkPolicy to the specific pod. This cuts the attacker's connection to their C2 server but keeps the pod alive for you to investigate.
  • Remove the pod's labels so the Load Balancer stops sending it production traffic.
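A sketch of both steps, assuming a CNI that actually enforces NetworkPolicy (e.g. Calico, or the VPC CNI with network policy enabled); the pod name and the app label are hypothetical. A dedicated quarantine label is added first, so the deny-all policy keeps matching the pod even after its production labels are removed:

```shell
# 1. Tag the pod so a dedicated policy can select it
kubectl label pod security-test-app-69cb774fb5-jdwv2 -n default quarantine=true

# 2. Deny-all NetworkPolicy: selecting the pod with no ingress/egress rules drops all traffic
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-compromised-pod
  namespace: default
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
    - Ingress
    - Egress
EOF

# 3. Remove the label the Service/Load Balancer selects on, so production traffic stops
kubectl label pod security-test-app-69cb774fb5-jdwv2 -n default app-
```

Note that the Deployment will spawn a clean replacement pod once the label is removed, which keeps production running while the quarantined pod is preserved for analysis.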

Ensure that compromised authentication data (such as access keys) is rotated and vulnerabilities are patched. Remember: treat your containers like cattle, not pets. Don't try to clean a compromised container; patch the vulnerability in your code, build a new image, and deploy a fresh, clean version of the app.

Investigating in CloudWatch

Having the logs in CloudWatch is one thing; finding the right events is another. Below are a few sample queries for common abuse patterns.

Common attack scenarios

Investigate kube exec abuse:

The kubectl tool can be used to execute commands directly within a pod, essentially granting a shell. Since kubectl talks to the Kubernetes API server, such execution can be found by filtering Kubernetes API requests where the requestURI contains the 'exec' keyword, as such:

fields @timestamp, requestURI, userAgent, sourceIPs.0
| filter @logStream like /kube-apiserver-audit/
| filter requestURI like "/exec"
| sort @timestamp desc

Example result:

@timestamp: 2026-05-01T12:08:09.129Z
requestURI: /api/v1/namespaces/default/pods/security-test-app-69cb774fb5-jdwv2/exec?command=%2Fbin%2Fbash&container=nginx&stdin=true&stdout=true&tty=true
userAgent: kubectl.exe/v1.34.1 (windows/amd64) kubernetes/93248f9
sourceIPs.0: 203.0.113.1

While activity may appear suspicious, it can be a legitimate administrator performing routine tasks. Correlate the user and the source IP address to determine whether this concerns malicious behaviour.
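To pivot on who is execing from where, the same audit data can be aggregated per user and source IP. A sketch using the AWS CLI; the log group name follows the usual EKS convention of /aws/eks/<cluster-name>/cluster, and the 24-hour window uses GNU date:

```shell
# Start a CloudWatch Logs Insights query aggregating exec calls per user and source IP
QUERY_ID=$(aws logs start-query \
    --log-group-name /aws/eks/my-cluster/cluster \
    --start-time $(date -d '24 hours ago' +%s) \
    --end-time $(date +%s) \
    --query-string 'filter @logStream like /kube-apiserver-audit/
        | filter requestURI like "/exec"
        | stats count(*) by user.username, sourceIPs.0' \
    --query queryId --output text)

# Fetch the results once the query has completed
aws logs get-query-results --query-id "$QUERY_ID"
```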

Investigate forbidden secrets listing:

Furthermore, kubectl can be used to retrieve secrets from the cluster. The behaviour is quite similar to attempting to exec into a pod via the tool, so the query to detect it is also quite similar:

fields user.username, responseStatus.code, operation, requestURI
| filter @logStream like /kube-apiserver-audit/
| filter requestURI like "/secrets"
| filter responseStatus.code = 403
| stats count(*) by user.username, requestURI

Example result:

user.username: arn:aws:iam::12345:role/dev-role
requestURI: /api/v1/namespaces/default/secrets?limit=500
count(*): 1

Real-World Context

In mid-2025, TraderTraitor, a North Korean state-sponsored threat group, was observed attacking a cryptocurrency exchange, using a compromised Kubernetes cluster as a pivot to other services [source]. Initial access to the cluster was achieved via phishing, allowing the actor to deploy a malicious pod designed to expose the mounted service account token. With this privileged service account, the actor could authenticate to the Kubernetes API, perform discovery, and create a backdoor in a production pod to maintain persistent access to the cluster. This in turn allowed them to move laterally to other cloud services and ultimately reach their goal: the financial systems.

Conclusion

Incident response in EKS is a shared responsibility with Amazon. As the real-world case shows, Kubernetes can also serve as a pivot to other internal (cloud) systems, hence the need to harden the environment appropriately. Since containers in Kubernetes are ephemeral, ensure that logging is enabled beforehand and accessible to your IR team. If an incident occurs, quarantine the infected resources and treat them as cattle, not pets, by replacing them rather than cleaning them.


Coming up next in Part 2: We’re crossing the fence over to Google Cloud to look at GKE. We’ll see how Google handles logging differently and how it compares to AWS.

Don’t wait for the 3:00 AM pager to find out your pod logs vanished.

The best time to configure your EKS logging is yesterday; the second best time is today. Whether you need help auditing your cluster configurations, setting up CloudWatch, or running table-top IR exercises, our cloud experts are ready.

Assess Your K8s Readiness with Invictus →

About Invictus Incident Response

We are an incident response company and we ❤️ the cloud. We help our clients stay undefeated.

🆘 Incident Response support: reach out to cert@invictus-ir.com or go to https://www.invictus-ir.com/24-7

Be ready for the next cloud incident.

Invictus Schield