Defeating Kubernetes Privilege Escalation: A Cloud Detection & Response Case Study

This case study serves to highlight the importance of rapid, heuristic, accurate, and contextualized detection and response in the cloud.


Attackers are constantly searching for new ways to target cloud environments and escalate initial access into full administrative privileges.

In recent months, our research team has observed a rise in attempts to escalate privileges from access to a Kubernetes cluster into access to the cloud control plane, creating significant risks for many organizations. Because containers often lack the visibility and prevention controls commonly applied to traditional compute resources, this expansion of the cloud attack surface can be especially challenging to defend.

A recent attack defeated by the research team highlights the importance of cloud-wide heuristic detections and immediate response capabilities in combating these rising threats. To protect the victim organization’s identity, certain details of the attack have been modified. However, every stage of the presented case study was performed by real attackers and responders.

Threat Detection

The investigation was triggered by two specific attacker TTPs detected in a sensitive production AWS environment: 

  1. An EC2 instance IAM role used from an irregular source IP address in AWS.

  2. Apparent reconnaissance actions being performed by the same role querying multiple EC2 instances and Security Groups.

While neither of these actions would necessarily have been classified as highly suspicious on its own, heuristic analysis based on environmental inventory baselines identified the role as one belonging to an EC2 instance running an EKS pod. This fact was key to the attack being detected early: actions which may be common or legitimate when performed by other roles in the environment were correctly identified as suspicious behavior for this specific IAM role.
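The two signals above can be sketched as a simple correlation over CloudTrail records. This is a minimal illustration, not the detection logic used in the case study: the function name, the `instance_ips` inventory mapping, and the `recon_threshold` default are assumptions introduced here. It relies on the fact that a session created from an EC2 instance profile carries the instance ID as its session name in `userIdentity.arn`.

```python
import ipaddress
import re

# An EC2 instance role session looks like:
#   arn:aws:sts::<account>:assumed-role/<role-name>/<instance-id>
_INSTANCE_SESSION = re.compile(
    r"^arn:aws[\w-]*:sts::\d{12}:assumed-role/(?P<role>[^/]+)/(?P<instance>i-[0-9a-f]+)$"
)

# Read-only API prefixes treated as reconnaissance in this sketch (an assumption).
RECON_PREFIXES = ("Describe", "List", "Get")


def flag_instance_role_events(events, instance_ips, recon_threshold=2):
    """Flag instance-role sessions used from IPs outside the known inventory.

    events:       iterable of CloudTrail records as dicts
    instance_ips: hypothetical inventory baseline, instance ID -> set of IPs
    """
    findings = {}
    for event in events:
        identity = event.get("userIdentity", {})
        if identity.get("type") != "AssumedRole":
            continue
        match = _INSTANCE_SESSION.match(identity.get("arn", ""))
        if match is None:
            continue  # not an instance-profile session
        source = event.get("sourceIPAddress", "")
        try:
            ipaddress.ip_address(source)
        except ValueError:
            continue  # calls made by AWS services carry a hostname here
        instance = match.group("instance")
        if source in instance_ips.get(instance, set()):
            continue  # expected source for this instance
        key = (match.group("role"), instance, source)
        finding = findings.setdefault(
            key,
            {"role": key[0], "instance": instance, "source_ip": source, "recon_calls": 0},
        )
        if event.get("eventName", "").startswith(RECON_PREFIXES):
            finding["recon_calls"] += 1
    for finding in findings.values():
        # Irregular IP plus repeated read-only calls raises the priority.
        finding["high_priority"] = finding["recon_calls"] >= recon_threshold
    return list(findings.values())
```

Keying the finding on role, instance, and source IP keeps the context an analyst needs for triage in one place, which is the point the paragraph above makes about baselines: the same API calls from the instance's own address produce no finding at all.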

With this initial activity identified, defenders turned to a quick triage process and determined there was no legitimate reason for this role to be performing the detected actions. As the team jumped into a full investigation, the key questions became whether and how this role could have been compromised. Without a quick and decisive resolution to this mystery, the team faced a familiar dilemma in incident response: do we respond and contain the suspicious activity before knowing all the facts?

Responding quickly has the obvious benefit of potentially preventing further damage. However, doing so without context often leads to ineffective containment measures, which only serve to let attackers know they’re being investigated and prompt them to speed up their attack. This is where rapid, contextualized investigation became crucial.

Investigation and Response

Pulling the thread of the initial suspicious events, the team was able to leverage context from CloudTrail logs, VPC Flow logs, and forensic artifacts from the EC2 instance to quickly put the pieces of the puzzle together. Due to a default configuration left in place, EKS pods were allowed to connect to the Instance Metadata Service (IMDS) on their host EC2 instance. Local OS logs showed the IMDS being accessed from the instance shortly before the reconnaissance activity using the compromised EC2 instance role began. That access enabled the attacker to escalate from control of the machine to control of the IAM role.
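As a rough illustration of how the default configuration described above might be audited, the sketch below evaluates an instance's metadata options. It assumes the input dict has the shape of the `MetadataOptions` structure returned by the EC2 `DescribeInstances` API (`HttpEndpoint`, `HttpTokens`, `HttpPutResponseHopLimit`); the function name and the fallback defaults for missing keys are assumptions made for this example, and the case study does not state which remediation was applied.

```python
def imds_exposure_reasons(metadata_options):
    """Return reasons why pods on this node could likely reach IMDS credentials.

    metadata_options: dict shaped like EC2 DescribeInstances "MetadataOptions".
    Missing keys fall back to permissive values (an assumption of this sketch).
    """
    reasons = []
    if metadata_options.get("HttpEndpoint", "enabled") == "disabled":
        return reasons  # IMDS is switched off entirely
    if metadata_options.get("HttpTokens", "optional") != "required":
        # IMDSv1 needs no session token, so any process able to route to
        # the metadata address can read the role credentials.
        reasons.append("IMDSv1 is allowed (HttpTokens is not 'required')")
    if metadata_options.get("HttpPutResponseHopLimit", 2) > 1:
        # A hop limit above 1 lets the IMDSv2 token response cross the
        # extra network hop between a container and its host.
        reasons.append("PUT response hop limit is greater than 1")
    return reasons
```

An empty list means neither of the two checks fired; it does not prove pods are isolated. Pods that run with host networking share the node's network namespace and can still reach the IMDS regardless of the hop limit, so a network-level block on the metadata address is a common complement.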

Forensic and log evidence further revealed that the EC2 instance was running an open-source application affected by a recently published remote code execution vulnerability (CVE). The vulnerability was successfully exploited, presumably as part of a scan, just hours before the EC2 instance IAM role was compromised.

As it happened, the DevOps team had deliberately run this open-source application behind a Security Group that disallowed any access from the internet. However, this changed on the day of the attack. An internet-facing service was deployed to the same subnet as the vulnerable open-source application, and the Security Group controlling access to both machines was modified to accommodate it, exposing the vulnerable application as well.
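A change like the one described above can be caught by checking Security Groups for ingress rules open to the whole internet. The sketch below is one simple way to do that, assuming the input dict has the shape of a group returned by the EC2 `DescribeSecurityGroups` API (`IpPermissions` with `IpRanges` and `Ipv6Ranges`); the function name is hypothetical.

```python
import ipaddress


def internet_open_rules(security_group):
    """List ingress rules that admit traffic from any address.

    security_group: dict shaped like one entry of EC2 DescribeSecurityGroups.
    Returns (protocol, from_port, to_port, cidr) tuples.
    """
    open_rules = []
    for perm in security_group.get("IpPermissions", []):
        cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
        for cidr in cidrs:
            # A prefix length of 0 covers every address: 0.0.0.0/0 or ::/0.
            if ipaddress.ip_network(cidr, strict=False).prefixlen == 0:
                open_rules.append(
                    (perm.get("IpProtocol"), perm.get("FromPort"), perm.get("ToPort"), cidr)
                )
    return open_rules
```

Running a check of this kind whenever a Security Group shared by several machines is modified would surface the exposure at the time of the change rather than after exploitation. Rules that reference other Security Groups or prefix lists are not evaluated by this sketch.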

Conclusion

This case study serves to highlight the importance of rapid, heuristic, accurate, and contextualized detection and response in the cloud. In addition to the obvious takeaways of implementing effective vulnerability management and segregating production applications, we must accept the fact that mistakes and misconfigurations can still happen.

The risk is especially high from attacks leveraging newly released CVEs or sparsely monitored applications, such as this attack targeting Kubernetes. The ability to centrally detect and rapidly triage complex anomalous events in the cloud is therefore not merely a “nice to have”, but a vital requirement of a successful cloud security strategy.
