Lateral movement risks in the cloud and how to prevent them – Part 1: the network layer (VPC)
In this first blog post in a series covering lateral movement in the cloud, we will introduce lateral movement as it pertains to the cloud’s network layer, the virtual private cloud (VPC). We will discuss attacker tactics, techniques, and procedures (TTPs), and outline best practices for security practitioners and cloud builders to help secure their cloud environment and reduce the risk of lateral movement in the VPC and between VPCs. This is especially important given 58% of cloud environments have at least one publicly exposed workload with a cleartext long-term cloud key stored in it.
Intro to cloud lateral movement
Lateral movement is a tactic used by adversaries to expand their network access in order to move through an environment and achieve their goals (e.g. exfiltrate sensitive data, commandeer workloads).
For years, lateral movement has been used against on-prem networks by abusing network protocols and services such as Active Directory, SMB, and NTLM. From the Stuxnet worm that propagated through network shares, to advanced threat groups like APT1 and APT29 performing pass-the-hash and pass-the-ticket attacks, lateral movement has been involved in numerous major attacks.
In a cloud environment, when attackers gain initial access and compromise a workload, they can abuse IAM permissions or ‘hop’ from one workload to another within the virtual private cloud (VPC) using well-known on-prem lateral movement techniques. Their goal is to reach highly valuable assets that can provide additional lateral movement and access to other cloud resources and identities (e.g. crown-jewel resources with sensitive data or admin identities), either inside the same VPC or outside of it.
Background: on-prem vs. cloud
Before diving into different lateral movement techniques in the cloud network layer, let’s first go over some important differences between on-prem and cloud environments:
Identity and Access Management (IAM)
Although many lateral movement techniques are applicable to both on-prem and cloud networks, IAM is a significant differentiator in the cloud: IAM governs access controls and authorizes identities to perform certain actions on specific resources. An attacker that has compromised one of these identities might be able to impersonate it, execute actions (depending on its effective permissions), and move laterally to different cloud resources in the account via cloud provider API commands.
Building an on-prem network demands extensive networking knowledge and is constrained by hardware limitations and slow procurement processes. In the cloud, by contrast, deploying and configuring VPCs and network resources (e.g. Internet Gateways, Load Balancers, ACLs) is simple and straightforward. However, that speed of execution increases the risk of exploitable network misconfigurations and resource compromise.
Visibility and risk management
The cloud’s complex architecture makes it challenging to track and secure thousands of resources, much less map connections between them, gauge effective permissions, and analyze and prioritize critical threats to organizations. In order to tackle this problem, all major CSPs support dedicated APIs that provide visibility into the resources deployed in cloud environments. Although useful to cloud administrators, such capabilities can be abused by malicious actors to determine the types of resources running in a compromised account that have crown-jewel potential.
Network lateral movement tactics, techniques, and procedures
Adversaries in the cloud leverage several techniques and functionalities to conduct lateral movement attacks. These include remote services, worms, valid accounts, VPC peering, IaaS/PaaS databases, vulnerabilities, and misconfigurations.
Access hosts via remote services – Malicious actors in a VPC can use cleartext private keys or credentials stored in compromised VMs to move laterally to machines that accept remote connections such as SSH and RDP. Once inside the VPC, they can also scan for vulnerable, exploitable remote services.
Plant worms – Adversaries often use worms to infect a workload and then scan for other workloads in the VPC that have exploitable vulnerabilities and security misconfigurations. For example, a Linux machine with unrestricted security-group rules and weak authentication methods is an easy target, since the worm can scan it and brute-force the local user’s password. The DreamBus botnet is a good illustration.
Impersonate valid accounts – Adversaries can abuse cleartext cloud keys of existing accounts and, given the right permissions, impersonate an IAM identity to compromise other cloud resources through the IAM layer. This can reach resources outside of the original VPC (e.g. S3 buckets) via cloud provider API commands. A compromised admin identity, or one that can escalate to such privileges, can result in a complete account takeover.
Move through VPC peering – Like site-to-site VPN, VPC peering enables communication between two isolated environments. It is supported by all major CSPs (AWS, Azure, GCP). If an attacker enters a VPC that is peered with another VPC and the peering grants unrestricted network access, the attacker can ‘escape’ the first network, move laterally to workloads in the second, and potentially compromise resources across subscriptions or even tenants.
Discover IaaS/PaaS databases – Cleartext private keys and credentials can grant adversaries access to IaaS or PaaS databases (e.g. RDS instances) residing in a compromised VPC, regardless of their public exposure. These types of databases may be crown jewels and contain highly sensitive data such as credentials or customer PII.
Exploit vulnerabilities and misconfigurations – When hunting for the valuable assets crucial to lateral movement, adversaries typically seek out the ‘low-hanging fruit’ located in the compromised VPCs. The ideal targets are exploitable workloads with vulnerabilities and security misconfigurations, such as network-reachable, internal VMs with critical RCE vulnerabilities and no strict security-group rules.
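One way to reason about the techniques above is to treat each possible hop as an edge in a graph and search for paths from an exposed workload to a crown jewel. The following is a minimal sketch of that idea, not a description of any particular tool: the resource names (`web-vm`, `build-vm`, `orders-db`, `s3-backups`) and the edge labels are hypothetical, and a real inventory would have to be built from your own environment.

```python
from collections import deque

# Hypothetical inventory: each edge means "an attacker on the source can reach
# the target". The label names the technique that makes the hop possible.
EDGES = {
    "internet": [("web-vm", "public exposure")],
    "web-vm": [
        ("build-vm", "cleartext SSH key"),
        ("s3-backups", "cleartext cloud key"),
    ],
    "build-vm": [("orders-db", "stored DB credentials")],
    "s3-backups": [],
    "orders-db": [],
}
CROWN_JEWELS = {"orders-db", "s3-backups"}


def lateral_paths(edges, start, targets):
    """Breadth-first search: shortest hop sequence to each reachable target."""
    paths = {}
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node in targets:
            paths[node] = path
        for neighbour, technique in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, technique, neighbour)]))
    return paths


if __name__ == "__main__":
    for target, hops in lateral_paths(EDGES, "internet", CROWN_JEWELS).items():
        chain = " -> ".join(f"{src} [{how}]" for src, how, _ in hops)
        print(f"{chain} -> {target}")
```

Removing a single edge (for example, deleting the cleartext SSH key from `web-vm`) and re-running the search shows which crown jewels become unreachable, which is a simple way to compare the effect of different mitigations.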
Most cloud environments are susceptible to lateral movement
Wiz Research investigated the number of cloud environments that possess at least one lateral movement path involving a publicly exposed workload within a VPC that has either a cleartext private SSH key or a cleartext long-term cloud key (e.g. AWS access key).
Our findings show that approximately 58% of cloud environments have at least one publicly exposed workload with a cleartext long-term cloud key stored in it, whereas about 35% of cloud environments feature at least one publicly exposed workload with a cleartext private SSH key.
In either case, such compromise enables adversaries to escalate their privileges within the environment in question or connect to other workloads in the VPC.
These numbers suggest that many organizations’ cloud environments lack adequate defenses against lateral movement attacks.
Recommended best practices
Here are five key network best practices that any organization should implement in its cloud environment to mitigate the risk of a lateral movement attack:
1. Implement strict firewalls (security groups and ACLs)
Whereas security groups act as firewalls for inbound/outbound traffic to/from VM instances within the VPC, ACLs function as firewalls at the subnet level. The best policy for all security groups and ACLs is to apply the principle of ‘least privilege’ to every rule: limiting access to specific IP addresses reduces the attack surface in the event of workload compromise. For example, a strictly configured security group can block network connectivity to a non-exposed VM that has an RCE vulnerability on a specific port, preventing an attacker from moving laterally to it.
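As an illustration of auditing for least privilege, the sketch below flags inbound rules that open remote-access ports to broad address ranges. It assumes the security group is a dictionary shaped like an entry returned by the AWS `DescribeSecurityGroups` API (`IpPermissions`, `FromPort`, `ToPort`, `IpRanges`, `CidrIp`). The port list and the `/24` threshold are arbitrary assumptions for the example, and the function name is hypothetical.

```python
import ipaddress

# Remote-access ports to check; an assumption for this sketch.
SENSITIVE_PORTS = {22: "SSH", 3389: "RDP"}


def overly_permissive_rules(security_group, min_prefixlen=24):
    """Return (port, service, cidr) for inbound rules open to broad IPv4 ranges.

    A range counts as broad when its prefix is shorter than min_prefixlen,
    so 0.0.0.0/0 and a whole-VPC /16 are flagged while a single /32 is not.
    """
    findings = []
    for rule in security_group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        if protocol not in ("tcp", "-1"):  # "-1" means all protocols and ports
            continue
        all_traffic = protocol == "-1"
        low = rule.get("FromPort", 0)
        high = rule.get("ToPort", 65535)
        for ip_range in rule.get("IpRanges", []):
            network = ipaddress.ip_network(ip_range["CidrIp"])
            if network.prefixlen >= min_prefixlen:
                continue
            for port, service in SENSITIVE_PORTS.items():
                if all_traffic or low <= port <= high:
                    findings.append((port, service, str(network)))
    return findings


if __name__ == "__main__":
    example_group = {
        "GroupId": "sg-example",  # placeholder identifier
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 3389, "ToPort": 3389,
             "IpRanges": [{"CidrIp": "10.0.1.15/32"}]},
        ],
    }
    for finding in overly_permissive_rules(example_group):
        print(finding)
```

The check deliberately covers internal ranges as well as `0.0.0.0/0`: for lateral movement, a rule that lets the entire VPC reach SSH on every machine matters even when nothing is exposed to the internet.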
2. Remove cleartext cloud and private keys
Cleartext long-term cloud keys should not be stored inside your cloud workloads. Compromised keys enable adversaries to ‘escape’ the network layer and move laterally between cloud services and resources, thereby maintaining persistence. Instead, ensure that only roles with least-privileged permissions are attached to EC2 instances (strict RBAC roles in Azure). These roles generate temporary credentials automatically, which eliminates the risk of long-term-key exposure and potential persistence in your environment.
As for private SSH keys, organizations can adopt more secure methods to remotely authenticate to internal machines. For instance, they can use bastion hosts to prevent port exposure, or utilize dedicated cloud provider services based on IAM permissions, such as AWS Systems Manager (SSM) or GCP’s Identity-Aware Proxy (IAP). In the case of Linux machines, these dedicated services would be safer options than password authentication.
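To find keys that are already sitting on a workload, a simple file scan can serve as a starting point. The sketch below looks for two well-known patterns: AWS long-term access key IDs, which begin with `AKIA`, and PEM-style private key headers. The function names and the size limit are assumptions made for the example. Matches are only candidates for review (a passphrase-protected key carries a similar header), and a clean result does not prove that no secrets are present.

```python
import re
from pathlib import Path

# Long-term AWS access key IDs: "AKIA" followed by 16 uppercase letters/digits.
ACCESS_KEY_RE = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")
# PEM-style header used by OpenSSH, RSA, EC, and PKCS#8 private keys.
PRIVATE_KEY_RE = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")


def scan_text(text):
    """Return the kinds of candidate secrets found in a string."""
    findings = []
    if ACCESS_KEY_RE.search(text):
        findings.append("aws-access-key-id")
    if PRIVATE_KEY_RE.search(text):
        findings.append("private-key")
    return findings


def scan_tree(root, max_bytes=1_000_000):
    """Walk a directory and map file paths to the candidate secrets they hold."""
    results = {}
    for path in Path(root).rglob("*"):
        try:
            if not path.is_file() or path.stat().st_size > max_bytes:
                continue
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        found = scan_text(text)
        if found:
            results[str(path)] = found
    return results


if __name__ == "__main__":
    for file_path, kinds in scan_tree(".").items():
        print(file_path, kinds)
```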
3. Remediate critical vulnerabilities
Once adversaries have successfully compromised a workload in a VPC, they will start scanning for other workloads residing in it with exploitable, critical vulnerabilities. Therefore, any critical vulnerabilities on any workload in your VPCs, both internet-exposed and non-exposed, should be remediated immediately.
4. Isolate your environment
Splitting up your environment into different VPCs based on their functionality (e.g. production) or group (e.g. Finance) strengthens your security posture. It reduces your attack surface and mitigates lateral movement risk by both enhancing visibility into your resources and minimizing the blast radius in the event of a security breach.
5. Adopt private link
As opposed to VPC peering, which provides broad bidirectional access across two different VPCs, private link is a more restricted, unidirectional mechanism. Private link allows a resource to expose an endpoint service to any chosen subscription in order to directly connect VPCs. It is offered by all major CSPs (AWS PrivateLink, GCP Private Service Connect, Azure Private Link).
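Where VPC peering has to remain in place, the breadth of access it grants can at least be reviewed. The sketch below flags route-table entries that send traffic for an entire peer VPC CIDR over a peering connection, as opposed to routes scoped to specific subnets. It assumes a route table shaped like an entry returned by the AWS `DescribeRouteTables` API (`Routes`, `DestinationCidrBlock`, `VpcPeeringConnectionId`) and IPv4 addressing; the function name and example identifiers are hypothetical.

```python
import ipaddress


def broad_peering_routes(route_table, peer_vpc_cidr):
    """Return destinations of peering routes that cover the whole peer VPC."""
    peer = ipaddress.ip_network(peer_vpc_cidr)
    broad = []
    for route in route_table.get("Routes", []):
        if "VpcPeeringConnectionId" not in route:
            continue  # not a peering route (e.g. internet gateway, local)
        dest_cidr = route.get("DestinationCidrBlock")
        if dest_cidr is None:
            continue  # IPv6 or prefix-list routes are out of scope here
        dest = ipaddress.ip_network(dest_cidr)
        # Broad when the destination equals or contains the peer VPC range.
        if peer.subnet_of(dest):
            broad.append(dest_cidr)
    return broad


if __name__ == "__main__":
    example_table = {
        "Routes": [
            {"DestinationCidrBlock": "10.1.0.0/16",
             "VpcPeeringConnectionId": "pcx-example"},
            {"DestinationCidrBlock": "10.1.4.0/24",
             "VpcPeeringConnectionId": "pcx-example"},
            {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-example"},
        ]
    }
    print(broad_peering_routes(example_table, "10.1.0.0/16"))
```

Routes are only one half of the picture: security groups and ACLs on the peered side still decide which traffic is accepted, so a flagged route is a prompt for review and not proof of unrestricted access.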
Summary
Developing a firm understanding of various attack TTPs is crucial to anyone working in cybersecurity. In this first blog post, we introduced the concept of lateral movement in cloud environments, focusing mainly on the VPC network layer. We presented the main differences between on-prem and cloud environments, outlined typical lateral movement techniques in the VPC, shared our team’s research findings, and highlighted five network best practices to reduce the VPC’s attack surface.
In the next post in this series, we will examine lateral movement from Kubernetes clusters to the cloud. We will explain some common attacker TTPs, and list key best practices to strengthen organizations’ environments and minimize the blast radius of potential breaches.
This blog post was written by Wiz Research, as part of our ongoing mission to analyze threats to the cloud, build mechanisms that prevent and detect them, and fortify cloud security strategies.