How to get rid of AWS access keys – Part 3: Replacing the authentication
In the earlier posts in this series, we showed how to get rid of unused access keys and how to minimize the risk of the remaining ones by applying a least-privilege strategy. In this final post, we’ll discuss the alternatives to using access keys.
When a compute resource in another cloud provider (e.g. GCP or Azure) needs to interact with your AWS resources, OpenID Connect (OIDC) may be used. With this solution, you configure an IAM role in your environment to trust the other cloud provider, which knows the compute resources in its own environment and can then give them the necessary access. Several other services also support OIDC use cases like this, such as GitHub Actions. Instructions on setting up this integration can be found in AWS and GitHub’s documentation. The resulting IAM role trust policy names the provider’s OIDC endpoint as a federated principal and uses conditions on the token’s claims to limit who can assume the role.
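A trust policy of this kind might look like the sketch below. It is a minimal illustration rather than the exact policy from the AWS or GitHub documentation: the account ID, organization, repository, and branch are placeholders, and the helper name `github_oidc_trust_policy` is hypothetical.

```python
import json


def github_oidc_trust_policy(account_id: str, org: str, repo: str, branch: str = "main") -> dict:
    """Build a trust policy that lets one branch of one GitHub repository assume the role."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The OIDC provider must already be registered in the AWS account.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience claim set by the token request for AWS.
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Subject claim: pins the role to one repository and branch.
                        f"{provider}:sub": f"repo:{org}/{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own account, organization, and repository.
    policy = github_oidc_trust_policy("123456789012", "GitHubOrg", "GitHubRepo")
    print(json.dumps(policy, indent=2))
```

Using `StringEquals` on the full `sub` value is the narrowest choice in this sketch; switching to `StringLike` with wildcards widens who can assume the role.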
When setting up these integrations, it is crucial to ensure that other users of those third-party services cannot also assume your role. For example, if you accidentally leave off the “sub” condition that specifies the “GitHubOrg/GitHubRepo”, then anyone with a GitHub account could create a GitHub Actions workflow that assumes your role! Looser variants of that condition could allow any branch in a repository, or any repository in your organization, to assume the role, potentially presenting a security risk.
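One way to catch these mistakes is to lint trust policies before deploying them. The following is a rough sketch, assuming the trust policy is available as a parsed JSON document; the function names are hypothetical, and it only understands the `StringEquals` and `StringLike` operators, so it is not a complete policy analyzer.

```python
from typing import Any, List, Optional

GITHUB_PROVIDER = "token.actions.githubusercontent.com"
SUB_KEY = f"{GITHUB_PROVIDER}:sub"


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list in most places; normalize to a list."""
    return value if isinstance(value, list) else [value]


def _weakness(operator: str, sub: str) -> Optional[str]:
    """Describe why a sub pattern is broad, or return None if nothing is flagged."""
    # Wildcards only act as wildcards under StringLike; under StringEquals they are literal.
    if operator != "stringlike":
        return None
    prefix, _, rest = sub.partition(":")     # "repo", "Org/Repo:ref:refs/heads/main"
    repo_path, _, ref = rest.partition(":")  # "Org/Repo", "ref:refs/heads/main"
    if prefix != "repo" or "*" in repo_path or "?" in repo_path:
        return f"'{sub}' is not pinned to a single repository"
    if "*" in ref or "?" in ref:
        return f"'{sub}' allows more than one branch, tag, or environment"
    return None


def find_sub_issues(trust_policy: dict) -> List[str]:
    """Return findings for GitHub OIDC statements with a missing or broad sub condition."""
    findings = []
    for index, stmt in enumerate(_as_list(trust_policy.get("Statement", []))):
        principal = stmt.get("Principal", {})
        federated = _as_list(principal.get("Federated", [])) if isinstance(principal, dict) else []
        if stmt.get("Effect") != "Allow" or not any(GITHUB_PROVIDER in str(f) for f in federated):
            continue  # not a GitHub OIDC trust statement
        subs = []
        for operator, conditions in stmt.get("Condition", {}).items():
            base = operator.split(":")[-1].lower()  # drop ForAnyValue:/ForAllValues: prefixes
            if base not in ("stringequals", "stringlike"):
                continue  # negated and other operators are out of scope for this sketch
            for key, value in conditions.items():
                if key.lower() == SUB_KEY:
                    subs.extend((base, v) for v in _as_list(value))
        if not subs:
            findings.append(
                f"Statement {index}: no sub condition, any GitHub workflow could assume the role"
            )
        for base, sub in subs:
            problem = _weakness(base, sub)
            if problem:
                findings.append(f"Statement {index}: {problem}")
    return findings
```

Whether an "any branch" finding is acceptable depends on who can push branches to the repository; the sketch reports it so that the choice is made deliberately.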
On-prem workloads
Finally, we can discuss situations in which trust cannot be obtained via AWS’s cloud or another cloud. If you would like to grant a server in a datacenter, a laptop, or another physical device access to your AWS environment, the two primary solutions offered by AWS are IAM Roles Anywhere and Systems Manager hybrid activation.
IAM Roles Anywhere
In many ways, IAM Roles Anywhere simply replaces an access key and secret key with a certificate. AWS relies on customers to operate a Public Key Infrastructure (PKI) that manages certificates better than they would manage access keys.
The utility that AWS provides for Roles Anywhere, rolesanywhere-credential-helper, can only read the certificate from a file on disk. The danger of storing the certificate on disk is that if an attacker can access the file, for example through an arbitrary file read vulnerability or by finding it in source control or backup storage, the certificate can be used from anywhere, much like an access key. Although you can apply conditions to the role with which the certificate is associated (e.g. restricting it to a source IP address), those same conditions can also be applied to IAM users and their access keys.
This service is mostly a re-imagining of iot:AssumeRoleWithCertificate. Because it relies on certificates, the private keys could in theory be stored inside something like a TPM, which performs signing operations without allowing the keys to be extracted. A researcher named Aidan Steele created an open-source tool called cloudkey that accomplishes this with a YubiKey and the aforementioned iot:AssumeRoleWithCertificate call, without requiring a human to press the YubiKey. This concept could be adopted for IAM Roles Anywhere and other secure hardware devices, such as a TPM. There is even an open PR for PKCS11 support to accomplish this, which AWS should review. It’s been almost a year since this service and its tooling were released, and this lack of support has been noted.
Systems Manager hybrid activation
Even though Systems Manager hybrid activation hasn’t received as much fanfare as IAM Roles Anywhere, it is worthy of consideration: it is used behind the scenes for ECS Anywhere and is part of IoT Greengrass. The key components of this mechanism are an activation code and device fingerprinting. The activation code is provided to the remote system and, by default, expires after one day and can only be used once. This mitigates a risk other solutions have, where the secret transferred to the client (which could accidentally appear in logs or other data sources that an attacker might come across later) can be used directly by the attacker to authenticate.
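The defaults described above can also be stated explicitly when creating an activation. The sketch below only builds the request parameters for the SSM `CreateActivation` API; the helper name `activation_request` and the role name are hypothetical, and actually sending the request would require boto3 and suitable permissions, which this example does not assume are present.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def activation_request(
    iam_role: str,
    hours_valid: int = 24,
    registration_limit: int = 1,
    now: Optional[datetime] = None,
) -> dict:
    """Build keyword arguments for SSM CreateActivation with a short-lived, single-use code."""
    now = now or datetime.now(timezone.utc)
    return {
        # Name of the IAM role the managed instance will receive.
        "IamRole": iam_role,
        # How many machines may register with this activation code.
        "RegistrationLimit": registration_limit,
        # After this time the activation code can no longer be used to register.
        "ExpirationDate": now + timedelta(hours=hours_valid),
    }


if __name__ == "__main__":
    params = activation_request("HybridActivationRole")  # hypothetical role name
    print(params)
    # With boto3 installed and credentials configured, the request could be sent as:
    #   import boto3
    #   response = boto3.client("ssm").create_activation(**params)
    #   # response contains "ActivationId" and "ActivationCode"
```

Keeping the limit at one registration and the validity window short means an activation code that leaks into a log after use is of no value to whoever finds it.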
The remote system generates its own key pair and a fingerprint of itself based on information about the BIOS, processor, memory, disk information, IP address, and more. This fingerprint is hashed and passed to AWS, along with the activation code and public key, to register itself via the undocumented API ssm:RegisterManagedInstance. In order to obtain credentials, the fingerprint is then used with the undocumented API ssm:RequestManagedInstanceRoleToken. These APIs are best documented by Aidan Steele in a different repo of his, awsaccountcreds, for an entirely different use case of SSM (Default Host Management Configuration).
This may sound like a large improvement because the remote system is generating a fingerprint of itself. However, this fingerprint data is cached on disk, and when combined with the private key, acts like an IAM user access key pair. To make things more frustrating, there is a lot of logic involved in determining how much the fingerprint can change before the remote system is considered different, but that logic is all performed client-side; the check does not appear to be designed to mitigate theft by an attacker.
Another concern with this feature is that once session credentials for AWS are obtained, they are stored in a file on disk at /root/.aws/credentials. However, these are session credentials which do expire.
Something worth pointing out, which may be a core benefit or a security concern to you, is that Systems Manager also enables you to manage the remote system if given the right set of privileges. This means you can deploy applications and obtain a terminal session on the remote system.
Which should you use?
Unless you have an established PKI that includes securely transferring certificates and methods of establishing trust with remote systems, Systems Manager hybrid activation provides compelling benefits, especially for one-off integrations with a handful of systems. The limited validity period and one-time use of the hybrid activation’s activation code are a significant risk mitigation. Furthermore, the system fingerprinting, although primarily checked client-side, does mitigate the ease with which engineers might otherwise freely copy credentials around from one system to another.
For larger deployments, IAM Roles Anywhere offers flexibility for IAM policy conditions on details of the certificates, which allows more granular authentication per machine when you don’t want to deploy individual roles for different use cases.
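As an illustration of conditioning on certificate details, the sketch below builds a Roles Anywhere trust policy that only admits certificates with a given subject common name issued under a given trust anchor. The helper name, the ARN, and the common name are placeholders chosen for the example.

```python
import json


def roles_anywhere_trust_policy(trust_anchor_arn: str, common_name: str) -> dict:
    """Build a trust policy limited to one trust anchor and one certificate subject CN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "rolesanywhere.amazonaws.com"},
                "Action": [
                    "sts:AssumeRole",
                    "sts:TagSession",
                    "sts:SetSourceIdentity",
                ],
                "Condition": {
                    # Certificate attributes are exposed to the policy as principal tags.
                    "StringEquals": {"aws:PrincipalTag/x509Subject/CN": common_name},
                    # Only accept sessions created through this trust anchor.
                    "ArnEquals": {"aws:SourceArn": trust_anchor_arn},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN and common name for illustration only.
    anchor = "arn:aws:rolesanywhere:us-east-1:123456789012:trust-anchor/EXAMPLE-ID"
    print(json.dumps(roles_anywhere_trust_policy(anchor, "build-server-01"), indent=2))
```

With one role per use case and the common name varied per machine, a single PKI can serve many machines without a separate role for each one.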
Both options would be improved significantly by AWS implementing PKCS11 so you can take advantage of TPMs to leverage Attestation Identity Keys for performing signing operations. Once that is done, the hybrid activation could then be improved by not storing the session credentials on disk. These could instead be made accessible through a credential-process vending solution, as it is more common for vulnerabilities to allow arbitrary file reads than to allow RCE (in part because arbitrary reads are generally considered less severe, so teams de-prioritize fixing and patching them). The system fingerprint would similarly benefit from not being cached on disk.
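To make the credential-process idea concrete: the AWS CLI and SDKs can be configured with a `credential_process` command that prints session credentials as JSON on standard output, so nothing needs to be written to a credentials file. The sketch below only shows the output format; where the credentials come from is left out, and the function name and sample values are hypothetical.

```python
import json
from datetime import datetime, timezone


def credential_process_output(
    access_key_id: str,
    secret_access_key: str,
    session_token: str,
    expiration: datetime,
) -> str:
    """Format session credentials as the JSON document a credential_process command prints."""
    return json.dumps(
        {
            "Version": 1,
            "AccessKeyId": access_key_id,
            "SecretAccessKey": secret_access_key,
            "SessionToken": session_token,
            # ISO 8601 timestamp in UTC; callers refresh once this time has passed.
            "Expiration": expiration.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    )


if __name__ == "__main__":
    # Dummy values for illustration; a real helper would obtain these in memory,
    # for example from a signing operation backed by secure hardware.
    print(
        credential_process_output(
            "ASIAEXAMPLEKEYID",
            "example-secret",
            "example-session-token",
            datetime(2030, 1, 1, 12, 0, tzinfo=timezone.utc),
        )
    )
```

A profile in the AWS config file would then point at the script with a line such as `credential_process = /path/to/helper`, where the path is a placeholder.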
Conclusion
There are now a number of options to help customers avoid having to rely on long-lived IAM user access keys. Those long-lived access keys have been a regular cause of security incidents, which could have been avoided if short-lived credentials were used instead, and if those credentials were more strongly tied to the systems that use them. This series has discussed a number of tactics for getting rid of IAM user access keys, mitigating their risks through least privilege and other means, and alternative authentication mechanisms to use instead. We hope to see fewer of them in the future, and as a result, fewer cloud security incidents!