A few weeks ago I saw this tweet from Dr. Anton Chuvakin, where he asked which vulnerabilities in recent memory have inflicted the most pain on security teams. This was a good question, and it got me thinking: what actually makes a vulnerability “painful”?
Certainly the most obvious factor is a vulnerability’s severity, often determined by its CVSS score (which isn’t always a reliable metric but is arguably still very useful). If a severe vulnerability is exploited in an organization’s environment, the impact on both the organization itself and its customers could be significant. Beyond severity, there are various other factors to consider that can help us determine whether a vulnerability is worth our time and effort.
However, putting aside the cost of successful exploitation, preventative efforts to mitigate severe vulnerabilities can be both time-consuming and time-sensitive, especially if a vulnerability is known to be exploited in the wild or already has a public exploit available. The assumption here is that the cost of fixing a vulnerability ahead of time is lower than the cost of it being exploited by a threat actor. Multiply that by the size of an organization’s attack surface, which potentially includes every single vulnerability in the environment that needs to be fixed, and the preventative workload adds up quickly.
Therefore, prevalence is another important pain factor: more affected workloads mean more potential vulnerability findings to triage. For highly prevalent fundamental pieces of software (such as Python, cURL or glibc) deployed in a large enough environment, this can mean tens of thousands of workloads. Every single one of these instances needs to be assessed and triaged, and security teams must methodically decide which cases can be safely ignored and which require immediate attention.
Since organizations must spend a great deal of time responding to vulnerabilities in order to avoid being impacted by exploitation, a third (less obvious) pain factor is how difficult it is to respond to a given vulnerability. This may not be a very flashy topic of discussion, but hard-to-remediate vulnerabilities can impose an enormous opportunity cost on security teams, so I think it’s worth exploring further.
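To make these three factors concrete, here’s a deliberately oversimplified sketch of how one might rank findings by overall “pain”. Every field and weight below is made up purely for illustration; this is not a real prioritization model:

```python
from dataclasses import dataclass

@dataclass
class Vulnerability:
    cve_id: str
    cvss_score: float         # severity (0.0-10.0)
    affected_workloads: int   # prevalence in this environment
    remediation_hours: float  # rough estimate of response effort

def pain_score(v: Vulnerability) -> float:
    # Toy heuristic: severity times prevalence, inflated by response effort.
    # Real prioritization should also weigh exploitability, exposure,
    # data sensitivity, and so on.
    return v.cvss_score * v.affected_workloads * (1 + v.remediation_hours / 8)

vulns = [
    Vulnerability("CVE-XXXX-0001", cvss_score=9.8, affected_workloads=12_000, remediation_hours=2),
    Vulnerability("CVE-XXXX-0002", cvss_score=7.5, affected_workloads=300, remediation_hours=40),
]
for v in sorted(vulns, key=pain_score, reverse=True):
    print(f"{v.cve_id}: pain={pain_score(v):,.0f}")
```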
A few words on vulnerability response

It might come as a surprise that patching isn’t always a simple thing to do, and “just patch” isn’t universally practical security advice for responding to newly discovered vulnerabilities. Fixing a vulnerability in a production environment generally requires going through several steps, each of which has a difficulty level that tends to scale with the size of the organization (see the sketch after this list):
1. Assessing the environment to figure out if it’s affected, which requires sufficient inventory and up-to-date observability into what technologies are deployed and what versions they’re running, in order to map all vulnerable instances.
2. Checking which vulnerable instances are actually exploitable, so they can be prioritized for fixing before others (just because an application is technically vulnerable to a given CVE doesn’t necessarily mean that an attacker has any practical way of exploiting it).
3. Deciding on the most appropriate fix, as there might be several patches or workarounds available, and some might be easier to deploy than others.
4. Testing that the chosen fix method won’t break any critical functionality (or compensating for that somehow).
5. Finally, applying the fix to every affected workload while prioritizing those at highest risk (besides focusing on exploitable cases, organizations might choose to use contextual information to prioritize publicly exposed workloads, those containing sensitive data, etc.).
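To tie these steps together, here’s a minimal sketch of the workflow in Python. All of the helper functions and workload fields are hypothetical placeholders; in a real environment each step would be backed by actual inventory, scanning, and patch-management tooling:

```python
def is_affected(workload: dict, cve_id: str) -> bool:
    return cve_id in workload.get("findings", [])   # placeholder for real scanning

def is_exploitable(workload: dict, cve_id: str) -> bool:
    return workload.get("exposed", False)           # placeholder for real analysis

def risk(workload: dict) -> int:
    # Crude contextual prioritization: exposed and sensitive workloads first.
    return int(workload.get("exposed", False)) + int(workload.get("sensitive_data", False))

def respond(cve_id: str, inventory: list[dict], fix: str) -> None:
    # Step 1: map all vulnerable instances using the asset inventory.
    vulnerable = [w for w in inventory if is_affected(w, cve_id)]
    # Step 2: flag the instances that are actually exploitable.
    exploitable = [w for w in vulnerable if is_exploitable(w, cve_id)]
    # Steps 3-4 (choosing and testing the fix) happen out of band here.
    # Step 5: apply the fix, highest-risk workloads first.
    for w in sorted(vulnerable, key=risk, reverse=True):
        status = "exploitable" if w in exploitable else "latent"
        print(f"patching {w['name']} with '{fix}' ({status})")

respond(
    "CVE-XXXX-0001",
    [{"name": "web-1", "findings": ["CVE-XXXX-0001"], "exposed": True},
     {"name": "db-1", "findings": ["CVE-XXXX-0001"], "sensitive_data": True}],
    fix="upgrade to 2.17.1",
)
```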
Since this process is more complicated than it might initially seem, let’s review the things that make some vulnerabilities harder to respond to than others, while taking the opportunity to point out steps that vendors can take to make this process easier for security teams. The rest of this blog post is structured as a list of questions to ask in order to estimate how “painful” a given vulnerability might be for your organization to deal with (see the following table for a quick summary).
| Question | Takeaway |
|---|---|
| How much time do defenders have to respond? | Vendors should pre-announce risky vulnerabilities and assign them CVEs, to give defenders a head-start on threat actors. |
| How hard is it to know if a product is affected? | Vendors should clarify if a vulnerability is rooted in a dependency and publish work-in-progress advisories while they evaluate if their products are affected. |
| How hard is it to check if an instance is vulnerable? | Vendors should make it as easy as possible for customers to determine what product version they’re running. |
| How hard is it to check if an instance is exploitable? | Vendors should make exploitability conditions absolutely clear in their advisories, and expose configuration flags so that they’re simple to detect during assessment. |
| How likely is a fix to break functionality? | Vendors should develop dedicated patches that only address a vulnerability without affecting any other functionality. Vendors should also provide guidance on workarounds to avoid patching entirely, if possible. |
| How hard is it to apply a fix? | Vendors should make patching a simple and automatable process and provide multiple options to accommodate different preferred patching methods. |
| How effective is the fix in the long-term? | Vendors should always aim to address root causes. For zero-day vulnerabilities being exploited in the wild, vendors should consider initially releasing a surface-level fix and then a secondary, more robust patch to mitigate variants. |
| What is the direct monetary cost of applying a fix? | Vendors should backport patches for especially risky vulnerabilities affecting especially prevalent end-of-life versions of their products. Organizations should avoid using end-of-life products in the first place so that they won’t find themselves being forced to purchase an upgrade under stress. |
Aspects of vulnerabilities that make them harder to deal with... and how vendors can make things easier

#1 – How much time do defenders have to respond to the vulnerability?

When a dangerous vulnerability is publicly disclosed, defenders must often scramble to apply fixes while both threat actors and security researchers are busy studying the update (patch-diffing) to develop a working exploit which might be used against slow-to-patch targets.
The stress of vulnerability assessment and response can be alleviated somewhat when vendors announce vulnerabilities ahead of time. This entails the vendor stating that an upcoming patch will include a fix for a vulnerability, without including any details about the vulnerability itself. The vendor then publishes the patch at a later predetermined date and time.
This pre-announcement gives organizations a head-start on threat actors, so that security teams have more time to inventory their environments in preparation for deploying a fix when it becomes available (assuming they have the tools to conduct an inventory) – everything takes longer when you’re not prepared. Some recent notable examples of this approach are vulnerabilities affecting OpenSSL and cURL.
Vendors should always assign such pre-announced vulnerabilities a CVE (so that everyone knows what everyone else is talking about). Additionally, it’s helpful to announce affected version ranges ahead of time, though for narrow ranges this information might serve as a hint for threat actors to discover the vulnerability on their own. Note that for zero-day vulnerabilities being exploited in the wild, pre-announcing simply isn’t a viable option, since threat actors already have a head-start.
#2 – How hard is it to know if a product is affected by the vulnerability?

If a CVE only affects one product, then there usually isn’t much room for error or interpretation. Conversely, there is an inherent challenge in responding to vulnerabilities affecting libraries (such as log4j, Spring Framework or OpenSSL), which can transitively affect a large number of dependent products. This is because the responsibility is on vendors to assess their own product’s susceptibility, which is determined by whether and how the product utilizes the library. For example, while many products might incorporate log4j, not all of them were susceptible to Log4Shell, since the vulnerability wasn’t necessarily exploitable within every application’s context.
Confusion can arise if a vendor assigns a CVE to their own product while the root cause is in fact a bug in a dependency utilized by other products as well (such as the recent case of CVE-2023-4863, which was initially reported to only affect Chrome, but actually affected the WebP library). In this scenario, it might take time for other vendors to realize that their own products are also affected, giving attackers an even larger time window in which the vulnerability can be exploited with no patch available for most affected products. To make matters worse, by this point a certain narrative has often already been established around the CVE (e.g., “this only affects Chrome”), making it harder to build public interest and drive remediation efforts.
Regardless, vendors must publish advisories noting whether their products are susceptible or not and release a patch if necessary. However, this evaluation can be complex and might take time (especially when vendors don’t receive prior notice of an upcoming vulnerability), and until then customers are left to wonder whether their environments are affected. In the interim, vendors should consider publishing an initial work-in-progress advisory stating that they’re working on evaluation.
When lacking any official information, customers might decide to assume the worst and patch the dependency themselves. However, doing so might unintentionally break functionality in the dependent product, or it might not be possible anyway — for example, in instances where the dependency is statically linked. Alternatively, customers might choose to temporarily cordon off any potentially affected workloads in their environment, thereby making services unavailable and incurring additional operational costs.
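In the absence of vendor advisories, one starting point is checking your own software inventory for the vulnerable dependency. Here’s a minimal sketch assuming CycloneDX-style SBOM files are available; the file paths and library name are stand-ins for whatever the advisory identifies, and presence alone says nothing about exploitability:

```python
import json

def products_shipping(sbom_paths: list[str], library: str) -> list[str]:
    """Return the SBOM files whose component list includes the given library."""
    hits = []
    for path in sbom_paths:
        with open(path) as f:
            sbom = json.load(f)
        if any(c.get("name") == library for c in sbom.get("components", [])):
            hits.append(path)
    return hits

# Example: which of our products even ship log4j-core?
# (Presence alone doesn't imply exploitability - as with Log4Shell, whether
# the vulnerable code path is reachable still has to be evaluated.)
print(products_shipping(["product-a.sbom.json", "product-b.sbom.json"], "log4j-core"))
```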
#3 – How hard is it to determine whether an instance of the affected product is vulnerable?

Vulnerability assessment is usually based on identifying the software or firmware version of an instance of the relevant product and comparing it to a list of affected version ranges. But extracting the version from the product isn’t always straightforward, and some products can make this especially challenging depending on what assessment method an organization is using (binary scanning, unauthenticated scanning, registry key lookups, package manifest lookups, etc.). For example, not all products expose their version in response to an unauthenticated scan (which is arguably a good thing), nor do all products explicitly list their version in their binary metadata (which is quite obviously not a good thing).
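Once a version has been extracted, the comparison itself is the easy part. A minimal sketch, using the third-party packaging library and made-up affected ranges (real ranges come from the advisory):

```python
from packaging.version import Version  # pip install packaging

# Hypothetical affected ranges: [first_affected, first_fixed) per branch.
AFFECTED_RANGES = [("1.0.0", "1.2.17"), ("2.0.0", "2.3.4")]

def is_vulnerable(extracted_version: str) -> bool:
    v = Version(extracted_version)
    return any(Version(lo) <= v < Version(fixed) for lo, fixed in AFFECTED_RANGES)

print(is_vulnerable("2.3.1"))  # True
print(is_vulnerable("2.3.4"))  # False - the first fixed version
```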
Another issue can arise when applying a fix creates discrepancies between versions extracted from different sources of information, such as:

- A fix backported by the package maintainer, where the package manager reports a patched release but the binary metadata still lists the original (vulnerable-looking) version.
- An in-memory patch that changes the running code without changing the version recorded on disk.

These ad-hoc solutions can certainly be very helpful, but they must be accounted for during vulnerability assessment, otherwise workloads with a fix applied might mistakenly resurface as vulnerable.
#4 – How hard is it to figure out if a specific instance is actually exploitable?

Some vulnerabilities are only exploitable under specific circumstances, such as when certain configurations are enabled. On one hand, this is good news, since it lowers the likelihood of exploitation. But on the other hand, vulnerability assessment must be used in conjunction with configuration assessment. To further complicate things, some products don’t have declarative configurations, or they might not expose their settings in a way that makes them verifiable to external detection methods (such as unauthenticated scanning), or their configurations might be applied at runtime, making disk-based configuration scanning less effective.
Similarly, some vulnerabilities are only exploitable in specific usage patterns, meaning that their significance depends on how an organization’s environment is set up and how the affected product is being utilized. For example, CVE-2023-38408 affecting OpenSSH was strictly a client-side vulnerability, meaning that any vulnerability findings on servers could be safely considered benign positives. Differentiating between clients and servers (or any other usage pattern) can be technically difficult, so this distinction isn’t always accounted for in vulnerability assessment processes. However, vendors should at the very least make such complex conditions absolutely clear in their advisories, and help customers better understand this type of nuance.
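As a rough illustration of how such conditions might be layered on top of raw findings during assessment, here’s a sketch in which the CVE identifier, configuration flag, and host roles are all invented placeholders; real conditions would come from the vendor’s advisory:

```python
# Hypothetical findings for a client-side-only CVE (cf. CVE-2023-38408),
# gated further on an invented configuration flag.
FINDINGS = [
    {"host": "bastion-1", "cve": "CVE-XXXX-1234", "role": "ssh-client",
     "config": {"risky_feature_enabled": True}},
    {"host": "sshd-7", "cve": "CVE-XXXX-1234", "role": "ssh-server",
     "config": {"risky_feature_enabled": True}},
]

def is_exploitable(finding: dict) -> bool:
    # Condition 1: the vulnerable code path is only reachable when a
    # specific configuration flag is enabled.
    if not finding["config"].get("risky_feature_enabled", False):
        return False
    # Condition 2: the bug is client-side only, so findings on servers
    # can be treated as benign positives.
    return finding["role"] == "ssh-client"

for f in FINDINGS:
    print(f["host"], "->", "exploitable" if is_exploitable(f) else "benign positive")
```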
#5 – How likely is a fix to break functionality?

Patches for a critical vulnerability should ideally be entirely dedicated to fixing the vulnerability and nothing else: this will simplify patch testing and lower the chances of a patch breaking functionality in any given environment.
Additionally, vendors should provide guidance for applying temporary workarounds that prevent a vulnerability from being exploited, such as by modifying certain configurations, so that customers can delay applying a patch until they have had time to properly test it in their environment.
However, note that when a fix is not an actual patch but rather a workaround (which doesn’t change the extracted version), this must be accounted for during reassessment, much like in-memory patching.
#6 – How hard is it to apply a fix?

Sometimes applying a patch is very simple and straightforward, and only involves logging into an affected system with sufficient privileges and running an executable or script. The less manual work involved, and the easier it is to automate the patching process, the better.
Conversely, remediating some vulnerabilities requires additional steps which might complicate things, such as manually inputting commands, rebooting the affected system (which can incur operational costs), or even replacing it entirely if the vulnerability has already been exploited (such as in the recent case of CVE-2023-2868, a vulnerability affecting Barracuda ESG). Remediation generally tends to get more complicated when dealing with physical appliances (though I might just be suffering from on-prem-o-phobia).
Ideally, vendors should provide multiple methods of applying a fix, so that customers can choose the most appropriate one for their environment. As mentioned above, vendors should also provide workarounds to accommodate customers who cannot immediately apply a patch.
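For the fortunate cases where patching really is just running a command, even a simple loop can automate the rollout. A minimal sketch (the host names and package are placeholders; a real rollout would add canarying, error handling, reboots where needed, and post-patch reassessment):

```python
import subprocess

HOSTS = ["web-1.example.internal", "web-2.example.internal"]  # placeholders
PACKAGE = "libexample"  # hypothetical affected package

for host in HOSTS:
    # Upgrade only the affected package, leaving everything else untouched.
    cmd = ["ssh", host, "sudo", "apt-get", "install", "--only-upgrade", "-y", PACKAGE]
    result = subprocess.run(cmd, capture_output=True, text=True)
    status = "ok" if result.returncode == 0 else f"failed: {result.stderr.strip()}"
    print(f"{host}: {status}")
```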
#7 – How effective is the available fix in the long-term?

When vendors don’t address the true source of a vulnerability, instead applying a surface-level patch that only fixes a highly specific bug, they leave the door open for bypasses: (relatively) easy-to-discover variants of the same vulnerability that simply exploit a slightly different path. Vendors have a responsibility to address root causes; otherwise, their customers will find themselves having to spend resources remediating multiple vulnerabilities instead of just one. This is especially true when it comes to run-of-the-mill (non-zero-day) vulnerabilities that aren’t being actively exploited in the wild; in such cases, vendors should be expected to take whatever time they need to release a single robust patch.
Conversely, when it comes to zero-days, releasing a simple patch for a known-to-be-exploited bug as fast as possible is justifiably the top priority for vendors, and it’s better to quickly release a “naïve” patch than to spend too much time developing one that solves the underlying problem. In such cases, it makes sense for customers to first apply an emergency zero-day patch and then later apply a secondary, more robust patch once the vendor has figured out the best long-term solution.
#8 – What is the direct monetary cost of applying a fix?

Finally, in some cases, organizations may be using an end-of-life licensed product which no longer receives security patches. This very much goes against best practices, but in such cases there is obviously a direct monetary cost to fixing a vulnerability, since it would require purchasing an upgrade for the affected product (assuming it’s still supported) or replacing it with something newer. There really isn’t much more to be said about this: organizations should avoid using end-of-life products, especially on workloads at higher risk of exploitation.
Vendors can and do sometimes backport patches for end-of-life versions of their products, though this is usually reserved for especially risky vulnerabilities affecting especially prevalent older versions (for example, VMware recently patched end-of-life versions of vCenter Server affected by a critical vulnerability).
Summary

Vulnerability management can be stressful and time-consuming, but some vulnerabilities cause more headaches than others. The most “painful” vulnerabilities tend to be those with high or critical severity, those that affect very prevalent products, and those that pose challenges for assessment and/or remediation.
However, vendors can take meaningful steps to make things easier for security teams, mainly by giving them more time to prepare for an upcoming patch; saving them time by streamlining detection and patching processes; and communicating as clearly as possible about exploitability conditions.
This blog post was written by Wiz Research, as part of our ongoing mission to analyze threats to the cloud, build mechanisms that prevent and detect them, and fortify cloud security strategies.