Kubernetes security refers to the measures and practices adopted to protect Kubernetes environments from unauthorized access and cyber threats. Kubernetes, being an open-source platform for automating the deployment, scaling, and operation of application containers across clusters of hosts, requires comprehensive security strategies due to its complex and dynamic nature.
Unfortunately, the complexity that Kubernetes brings also opens the door to many security risks, such as compromised nodes, containers, or the cluster control plane, that were not present in traditional models of running servers.
In this article, we’ll take a look at best practices to solve the challenges that come with Kubernetes workloads.
Securing Kubernetes entails many aspects. Since multiple running components come together to form Kubernetes, you will have to secure each component separately. Below are a few of the steps you can take to better safeguard your Kubernetes components.
RBAC, or role-based access control, is present by default in Kubernetes. It is very easy to create a Kubernetes cluster with admin roles and hand those roles to everyone so they can create objects and make changes. However, this is a risky practice, since everyone will then be able to change anything in the entire cluster.
Take a case where someone is trying to test a few features in a local Kubernetes cluster and by mistake runs those commands in production because they didn’t change the context of the cluster. This can destroy your entire production environment.
You should always follow the minimum-permission (least-privilege) model for service accounts and users, and Kubernetes lets you do this down to the level of individual resources and actions (verbs). We'll look at a role and role binding example below.
You can utilize Kubernetes namespaces to isolate developers from each other's workloads and give them their own space to explore. This looks somewhat like multi-tenancy from a developer’s perspective.
Pro tip
"The main motivation behind user namespaces in the container context is curbing the potential impact of container escape. When a container-bound process running as root escapes to the host, it is still considered a privileged process with its user ID (UID) equal to 0. However, user namespaces introduce a consistent mapping between host-level user IDs and container-level user IDs that ensures a UID 0 on the container corresponds to a non-zero UID on the host. In order to eliminate the possibility of UID overlap with the host, every pod receives 64K user IDs for private use."
With namespaces and RBAC implemented properly, you can achieve very granular control over which resources a developer can access in which environments. For example, a user can be mapped to the namespace where their application runs, so they can make changes in that namespace but cannot disturb any other namespace. Network policies add a further layer of security and follow the principle of least privilege.
You can use roles and role bindings to achieve this. The role defines the access and permissions that a user has, while the role binding maps a role to the groups a user belongs to.
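Here is a minimal sketch of such a manifest, reconstructed from the description that follows; the role name, namespace, and group name come from the text, so adjust them for your environment:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: NameSpaceapp1-FullAccess
  namespace: app1
rules:
  - apiGroups: ["*"]      # all API groups
    resources: ["*"]      # all resources in the namespace
    verbs: ["*"]          # all actions allowed
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: NameSpaceapp1-FullAccess-binding
  namespace: app1
subjects:
  - kind: Group
    name: app-1-ns-group          # group that users are assigned to
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: NameSpaceapp1-FullAccess
  apiGroup: rbac.authorization.k8s.io
```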
The role defined above has access to the namespace “app1” across all apiGroups and all resources, and all verbs are allowed. Next comes the role binding, which maps the role NameSpaceapp1-FullAccess to the group app-1-ns-group. A user assigned to this group can then access resources in that particular namespace.
Use proper, verified container images
Kubernetes lets you run your workloads without the hassle of managing the underlying machines, autoscaling, and configuration management. But what if a workload has security flaws? This can cause multiple issues from inside the cluster. So, making sure that you get your images from verified sources and always use updated images is critical.
Adding image scanning in your build phase will help a lot. Also, make sure to deprecate older image versions, and always use the latest base images to build your container image.
You can also leverage tools like Trivy to scan images in your build phases and Falco to monitor container behavior at runtime.
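For example, a build step like the following fails the pipeline when Trivy finds high or critical vulnerabilities; the image reference is a placeholder:

```bash
# Fail the build when HIGH or CRITICAL vulnerabilities are found;
# replace the image reference with your own.
trivy image --severity HIGH,CRITICAL --exit-code 1 registry.example.com/demo-app:1.2.3
```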
Implement runtime container forensics
Monitoring running containers can be a tricky task, but many issues only arise at runtime and may not be caught during the image scanning process. Wiz container forensics can help here, allowing you to scan your containers for any such problems.
Issues like privilege escalations, lateral movement, and whether or not an attacker gained persistence in an environment can be caught using Wiz container forensics.
Pro tip
"Taking a workload snapshot is not always sufficient. There are different runtime events that leave limited traces on disk such as fileless malware. Furthermore, threat actors attempt to erase any trace on the disk after carrying out malicious activity. Tracking runtime events on containers, nodes, and VMs can therefore enable a comprehensive workload-related investigation."
Everyone knows how important upgrades are in software engineering, but they are often neglected, which leads to a reliance on legacy systems. Upgrades don't just add new features; in most cases, they also ship a lot of security patches. Continuously upgrading your Kubernetes clusters to the newest version keeps them patched against new security threats as they come up.
Logs help you understand what is happening behind the scenes and what actions have been taken. Logs become really important when you’re facing any issue. With the help of logs, you can trace back any incident to the events that actually caused it.
This is key during security breaches, since you can track down exactly what an attacker did to exploit your system and block their actions or system access. To do this, look at your API server logs, kubelet logs, and pod logs and events.
How to look at API server logs and events
On self-managed control planes, these logs are generally available at /var/log/kube-apiserver.log, but this can vary from cloud to cloud, and each cloud provider has its own way to access logs and events. In Amazon EKS, API server and audit logs can be shipped to CloudWatch Logs, and in GKE, they are available through Cloud Logging.
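Kubernetes events, which often point you to the relevant API server activity, can also be listed directly with kubectl, for example:

```bash
# List recent events across all namespaces, oldest first
kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp
```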
How to look at node logs and events
Node logs, including kubelet logs, live on the node itself, typically under /var/log or in the systemd journal. You can access them by connecting to the machine and using a simple tail (or journalctl) command.
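For example (whether the kubelet logs to journald or to a file, and the exact file name, vary by distribution):

```bash
# On systemd-based nodes, the kubelet usually logs to the journal:
journalctl -u kubelet -f

# On nodes that log to files instead, tail the kubelet log directly
# (the exact file name varies by distribution):
tail -f /var/log/kubelet.log
```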
Starting with Kubernetes version 1.27, the node log query API (behind the NodeLogQuery feature gate) lets you view node logs through the API server. Below is an example of how to do this.
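A minimal sketch, assuming the feature gate is enabled on your cluster; node-1 is a placeholder node name:

```bash
# Query kubelet logs on a node through the API server
kubectl get --raw "/api/v1/nodes/node-1/proxy/logs/?query=kubelet"
```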
Pod logs and events are easy to view, as they have long been available via kubectl commands, for example:
kubectl logs -f pod_name -n namespace_name
Practice isolation
One of the best ways to add an extra layer of security is by isolating Kubernetes worker nodes and the API server from public networks. With Kubernetes, you can achieve this by not allowing any of your Kubernetes components to be accessible from outside your private network.
If you can restrict worker nodes from being accessed by a public network, you will greatly reduce the attack surface. Similarly, by limiting access to your API server to only those within your private network, you will boost your security posture.
Kubernetes Secrets hold sensitive data such as passwords, tokens, and encryption keys that applications need to function properly. If compromised, secrets can expose sensitive data and lead to unauthorized access, data breaches, and other security incidents. A few Kubernetes security best practices for secrets include:
Avoid Storing Plaintext Secrets in Environment Variables or Configuration Files: Never store sensitive data directly in plaintext environment variables, ConfigMaps, or configuration files checked into version control. These are not secure locations and can easily be exposed to unauthorized parties.
Rotate Secrets Regularly: Regularly rotate secrets to reduce the window of exposure and to invalidate credentials that may already have leaked. Kubernetes has no built-in rotation command, so use an external secret manager or a secrets operator to automate rotation.
Avoid Hardcoding Secrets in Code: Never hardcode secrets directly into application code or container images. Instead, reference Kubernetes Secret objects and inject their values at runtime, for example as mounted files or environment variables sourced from the Secret, as shown below.
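Here is a minimal sketch of that pattern; the Secret name, namespace, key, and image reference are all hypothetical placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: app1
type: Opaque
stringData:
  password: change-me        # example value only; in practice, populate from a secret manager
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  namespace: app1
spec:
  containers:
    - name: app
      image: registry.example.com/demo-app:1.0   # placeholder image
      env:
        - name: DB_PASSWORD                      # injected at runtime from the Secret
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```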
Enable audit logging
In many cases, auditing is not turned on when you create a Kubernetes cluster. Make sure that it is, and if possible, ship the audit logs to one place for the most effective correlation and analysis.
Auditing is key to making sure that any changes rolled out meet your standards and do not expose your system to exploitation.
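On self-managed control planes, enabling auditing generally means pointing the kube-apiserver at an audit policy file (--audit-policy-file) and a log destination (--audit-log-path); managed offerings expose audit logging through the provider instead. A minimal sketch of a policy that records metadata for every request:

```yaml
# Minimal audit policy: record request metadata (user, verb, resource, timestamp)
# for every request. Real-world policies are usually more selective.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
```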
Apply CIS benchmarks
Kubernetes hardening is another step toward making sure you have proper controls in place to avoid vulnerabilities and leaks. The CIS Kubernetes Benchmark provides rules and guidelines that are referenced by compliance frameworks like PCI DSS, HIPAA, and NIST. It contains controls at the network, access, and configuration levels, and most of them are applied and verified at the agent or node level.
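One commonly used open-source checker for these controls is kube-bench, which runs the CIS Kubernetes Benchmark checks against a node. Treat the invocation below as a sketch; the available targets and flags depend on your kube-bench version and Kubernetes distribution:

```bash
# Run the CIS Kubernetes Benchmark worker-node checks on the current node
kube-bench run --targets node
```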
It is no longer a question of why you need Kubernetes security. The more important questions are how you can achieve it and which components you'll need to test.
Kubernetes comprises two sets of programs or processes:
control-plane processes
worker or data-plane programs.
Control-plane components
The control plane consists of etcd, a consistent database where all the configurations are stored. Then comes the API server, which is the only component that talks to etcd; every other component talks to the API server for information or updates. The remaining key components are the scheduler, which assigns pods to nodes, and the controllers, which reconcile cluster state and handle tasks like creating disks and nodes in the cloud.
Worker-node components
These include the kubelet, which launches the workload, makes sure it is running, and watches for containers that die and need to be restarted. The kube-proxy creates the proper iptables rules so traffic can be routed to the services. Lastly, container network interface (CNI) plugins are important for pods to connect to the network and make traffic routable.
It is very important to first secure the API server and etcd. These two, if exposed, can cause havoc in your system since they have direct access to the Kubernetes configurations.
After this, you have to make sure that the network-level configurations are in proper shape so that no one can access your Kubernetes network. This includes using network policies to control which pods can communicate with each other.
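As a starting point, a default deny-all ingress policy (the app1 namespace below is just an example) blocks all pod-to-pod ingress traffic until more specific policies explicitly allow it:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: app1
spec:
  podSelector: {}        # selects all pods in the namespace
  policyTypes:
    - Ingress            # no ingress rules defined, so all ingress is denied
```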
Shared responsibility
Shared responsibility is a very important part of maintaining the security of your whole system. Building software is a collaborative effort by many members from different teams, each of which handles different aspects of software engineering. So it becomes really important for teams to understand their roles and responsibilities toward the security of the Kubernetes infrastructure.
You need to clearly understand how changes by different teams to containers, images, networking, and deployment can impact security; this will help your overall security posture.
Most Kubernetes workloads are volatile in nature: if you launch a pod, it may get deleted and a new one will pop up in its place. This makes traditional methods of security obsolete here.
Previously, security scripts or Ansible playbooks based on CIS guidance could be run to keep VMs secure and to apply patches. But in Kubernetes, if you run such scripts in the pods, the pods may get replaced, erasing any security patches or changes you made.
This problem is solved by baking security patches in when you design your pod specs and build your images. Gating also helps a lot with immutable infrastructure: if everything you create has to pass through a gate for a security check, you get an extra layer of protection and some assurance that vulnerability scans were performed. How much assurance will, of course, depend on the implementation and the number of steps you put in your security pipelines and gates.
Fortunately, Kubernetes can very easily be defined as code. If you keep all of that code in IaC pipelines and use proper security checks, it becomes very easy to manage Kubernetes securely.
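For example, the same scanner mentioned earlier for images can also check Kubernetes manifests and other IaC files for misconfigurations before they are applied; the directory path below is a placeholder:

```bash
# Scan Kubernetes manifests (and other IaC files) for misconfigurations
trivy config --severity HIGH,CRITICAL ./k8s-manifests
```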
All of the above are basic steps that you should take to make sure your Kubernetes clusters are secure. No one should be able to access your clusters from a public network, and with RBAC, not just anyone should be able to make changes to a resource.
Auditing will help catch and understand security issues. Wiz's cloud-native Kubernetes Security solution can be a great auditing tool if you want to scan and check your Kubernetes clusters for security vulnerabilities:
Scan all your containers, hosts, and Kubernetes clusters to gain full visibility into your containerized environment without any blind spots.
Correlate audits and logs, and create a prioritized view of container risks to catch misconfigurations, vulnerabilities, public containers, excessive permissions, and exposed secrets.
Proactively remove containers at risk, and stop attacks on your environments.
Shift security toward developers by helping them identify containers with security concerns during the image build time.
Enforce image signing to validate that only signed container images can be used in production.
Use admission controllers to block misconfigured pods or actions before they are deployed.
Scan the YAML to uncover vulnerabilities in the code.
Leverage real-time threat management via a dashboard with all relevant information.
Detect real-time malicious behavior in Kubernetes clusters.
Learn why CISOs at the fastest growing companies choose Wiz to secure their Kubernetes workloads.