Cloud Logging Tips & Tricks: Getting the most value out of your cloud logs

In the cloud, logs are often the only way to get real-time visibility into what's happening, making them critical to any cloud detection and response program.


Cloud logging is a critical piece of any cloud detection and response program. Logs have always been an important source of information for the security operations center, but in the cloud, where the shared responsibility model often limits the access that organizations have to managed services and where infrastructure is managed through centralized APIs, logs are often the only way to get real-time visibility into what’s happening in the cloud environment. 

Managing cloud logs is essential for cloud security, but understanding the world of logging can be complex. Each cloud provider offers different log categories, each with unique names, formats, and configuration options, and ensuring all necessary logs are enabled can be quite difficult. Understanding which logs you need to collect and the relevant use cases for each is often a real challenge for teams.

We’re here to help!  In this guide, we'll introduce you to different log types and unveil some clever tricks we've picked up over the years to optimize logging configuration without straining your budget. So, fasten your seatbelts and join us on this enlightening expedition into cloud logging!

Our cloud logging framework: categorizing and prioritizing log collection 

To make cloud logging more approachable, we’ve developed a framework that categorizes cloud logs by their security use case. The framework takes two approaches. First, we’ve mapped each cloud log source into categories: logs related to the control plane, data access, network, secrets, and compute. Each category includes the most relevant logs for preparing for a cyber attack targeting the resources in that category. For example, to detect data exfiltration targeting S3 buckets, organizations would need to monitor S3 data events in CloudTrail.

Logs by Category

The second approach uses the MITRE ATT&CK cloud matrix, a widely recognized framework for describing attacker tactics and techniques. We map each log category to the relevant ATT&CK techniques. For example, to detect attackers using valid accounts techniques, organizations should ensure they’re monitoring Microsoft Entra logs. By understanding these categories, you can determine which logs are crucial for preparing for or investigating each phase of an attack.

MITRE chart color-coded by log category

And now, let’s dive into each category to better understand why each is useful, the types of logs that are included in each, and practical tips to optimize your log collection.

Category 1: Control

AWS CloudTrail, Azure Activity logs, Microsoft Entra sign-in and audit logs, GCP Audit Logs, Google Workspace audit logs

If you have to prioritize which logs to collect, logs from the control category are the prime choice. These logs are often referred to as management logs, and are invaluable for capturing a wide spectrum of potential threats along the cyber kill chain. While other logs give insight into specific parts of the kill chain, management logs offer the most comprehensive coverage and give insights into all parts of a cyber attack, from initial access to impact.

What do these logs contain? 

They cover control plane identity and resource management activities such as resource and user creation, deletion, and modification. Additionally, they record user and app sign-ins and read activities on resources, including listing all resources and retrieving metadata about specific resources. In all cloud providers, these logs are available by default in the portal/console.

Why should you collect them? 

The management category offers the broadest coverage, providing visibility into multiple aspects of the kill chain. Most significantly, these logs excel at capturing cloud-native initial access and discovery techniques, enabling proactive detection of potential attackers.

What can you detect using the logs?

  • Sign-ins from unusual locations or providers

  • Password brute force

  • Creation of users, roles, VMs, or functions for persistence

  • Enumeration activity across multiple services

  • Unusual resource creations

  • Data destruction

Configuration Tips and Tricks

AWS – CloudTrail logs

  • Use an organization trail. Logs are retained by default in the CloudTrail console for 90 days, and you can create a trail to extend retention or send the logs to a different location. We recommend the organization trail option, which automatically applies to new accounts, ensuring comprehensive logging across all accounts within your organization. It's surprising how easy it is to lose track of newly created accounts lacking any logs, and organization trails are the solution to ensure you maintain visibility (see the sketch after this list).

  • The first trail is free. You can deliver one copy of your ongoing management events to an S3 bucket free of charge using a trail.
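To make the organization-trail tip concrete, here is a minimal boto3 sketch. The trail and bucket names are hypothetical, it must run from the organization's management (or delegated administrator) account, and the bucket needs a policy that allows CloudTrail delivery:

import boto3

cloudtrail = boto3.client("cloudtrail")

cloudtrail.create_trail(
    Name="org-trail",                  # hypothetical trail name
    S3BucketName="my-org-cloudtrail-logs",  # hypothetical, pre-existing bucket
    IsMultiRegionTrail=True,           # cover every region, including unused ones
    IsOrganizationTrail=True,          # automatically applies to new member accounts
    EnableLogFileValidation=True,      # detect tampering with delivered log files
)
cloudtrail.start_logging(Name="org-trail")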

Azure – Subscription activity logs and Entra sign-in and audit logs

  • Extend retention. The default retention period for subscription activity logs in the portal is 90 days, while Entra ID audit and sign-in logs are retained for 7 to 30 days, depending on the license. In practice, longer retention is often needed when investigating threats. We highly recommend creating a diagnostic setting to send the logs to an additional destination where they can be retained longer (see the sketch after this list).

  • No read activity. Azure control logs lack visibility into most control plane read activity, making them insufficient for detecting discovery attempts. Although access to data resources can be found in resource-level logs, actions such as listing resources, users, and policies are not logged anywhere.
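As a sketch of the retention tip, the following uses the azure-mgmt-monitor SDK to export subscription activity logs to a Log Analytics workspace. All names and IDs are placeholders, and Entra sign-in/audit logs are exported through a similar tenant-level diagnostic setting:

from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
workspace_id = ("/subscriptions/00000000-0000-0000-0000-000000000000"
                "/resourceGroups/rg-logs/providers/Microsoft.OperationalInsights"
                "/workspaces/security-logs")  # placeholder workspace resource ID

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

# A diagnostic setting at subscription scope exports the activity log.
client.diagnostic_settings.create_or_update(
    resource_uri=f"/subscriptions/{subscription_id}",
    name="export-activity-log",
    parameters={
        "workspace_id": workspace_id,
        "logs": [
            {"category": "Administrative", "enabled": True},
            {"category": "Security", "enabled": True},
        ],
    },
)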

GCP

  • Audit Logs are highly centralized. Unlike the other providers, GCP centralizes most log data within a single Audit Log instead of separating data into different log streams. Organizations can simply enable different event types within the single audit log stream.

  • Write Activity Logs are free. Audit Logs encompass several types: Admin Activity, Data Access, System Event, and Policy Denied. Admin Activity logs, which cover control plane write activity, are enabled by default and retained for 400 days, which is typically sufficient for most use cases.

  • Enable Read activity. The Admin Activity log only includes control plane write operations. To capture control plane read operations, enable Data Access “ADMIN_READ” logs for all services (see the sketch after this list).

  • Additional Write Activity. It's worth noting that specific changes to resource configurations are logged in the Data Access "DATA_WRITE" logs, but these logs are not enabled by default. We will cover those logs in the upcoming sections.
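Enabling ADMIN_READ for all services is done by editing the auditConfigs section of the project IAM policy. Here is a hedged sketch using the Google API client; the project ID is a placeholder, and note that this read-modify-write replaces existing auditConfigs, so merge rather than overwrite in real use:

from googleapiclient import discovery

PROJECT_ID = "my-project"  # placeholder

crm = discovery.build("cloudresourcemanager", "v1")
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()

# Turn on control plane read logging for every service.
# (In production, merge with any existing auditConfigs instead of replacing.)
policy["auditConfigs"] = [{
    "service": "allServices",
    "auditLogConfigs": [{"logType": "ADMIN_READ"}],
}]

# updateMask must name auditConfigs, otherwise the field is ignored.
crm.projects().setIamPolicy(
    resource=PROJECT_ID,
    body={"policy": policy, "updateMask": "auditConfigs,bindings,etag"},
).execute()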

Google Workspace

The five types of Google Workspace logs (Admin, Login, SAML, OAuth, Groups) are enabled by default and retained for 6 months in the Admin console.

  • Stream logs to GCP. We recommend streaming Google Workspace logs into GCP audit logs. Streaming logs is available for free and allows the security team to consume them in a single location within the organization.

  • No logs for errors. Note that no log type except Login includes error events. This omission can complicate detecting unauthorized attempts to modify identity resources.

Here is a partial Azure user sign-in log:

{
  "appDisplayName": "Azure Portal",
  "appId": "c44b4083-3bb0-49c1-b47d-974e53cbdf3c",
  "resourceDisplayName": "Windows Azure Service Management API",
  "resourceId": "797f4846-ba00-4fd7-ba43-dac1f8f63013",
  "authenticationRequirement": "multiFactorAuthentication",
  "userDisplayName": "User Name",
  "userId": "11111111-1111-1111-1111-111111111111",
  "userPrincipalName": "user@company.co",
  "ipAddress": "12.23.34.45",
  "location": {
    "city": "New York",
    "countryOrRegion": "US",
    "geoCoordinates": {
      "altitude": null,
      "latitude": 40.74839,
      "longitude": -73.9856
    },
    "state": "New York"
  },
  "signInEventTypes": [
    "interactiveUser"
  ],
  "status": {
    "additionalDetails": null,
    "errorCode": 0,
    "failureReason": "Other."
  },
  "userAgent": "Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion",
  ...

An effective method for detecting compromised user accounts is to analyze sign-in logs like the one above, particularly by identifying sign-ins from unusual countries or providers. Azure simplifies this process by including location information directly within the “location” object in the log.
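To illustrate, here is a minimal Python sketch (not a production detection) that builds a per-user baseline of countries from historical sign-in events shaped like the log above, then flags successful sign-ins from countries never seen for that user:

from collections import defaultdict

def build_country_baseline(events):
    """Map each user to the set of countries seen in successful historical sign-ins."""
    baseline = defaultdict(set)
    for e in events:
        country = e.get("location", {}).get("countryOrRegion")
        # errorCode 0 marks a successful sign-in in Entra ID logs
        if country and e.get("status", {}).get("errorCode") == 0:
            baseline[e["userId"]].add(country)
    return baseline

def flag_unusual_sign_ins(new_events, baseline):
    """Return sign-ins originating from a country not in the user's baseline."""
    flagged = []
    for e in new_events:
        country = e.get("location", {}).get("countryOrRegion")
        if country and country not in baseline.get(e["userId"], set()):
            flagged.append((e["userPrincipalName"], country, e["ipAddress"]))
    return flagged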

Category 2: Data Access

A significant proportion of cloud-focused attacks in recent years have targeted data, including data destruction, exfiltration, and ransomware, as seen in Peach Sandstorm’s password spray campaigns and LAPSUS$’s operations. Therefore, prioritizing efforts to protect company data becomes imperative. To achieve this goal, it's essential to familiarize yourself with the various logs related to data access.

What do these logs contain? 

All cloud providers offer numerous data-related services, such as S3 and RDS in AWS, storage accounts and SQL databases in Azure, Cloud Storage buckets and Cloud SQL in GCP, and more. Each of these services has its own method of collecting data access information. For instance, services used for file storage typically log file retrievals, while database services typically record SQL query executions.

Why should you collect them? 

Monitoring these logs serves as a crucial defense against a multitude of attack scenarios targeting data, including data exfiltration, manipulation, and ransomware attacks. The importance of these logs also becomes clear when investigating the consequences of a misconfiguration or a cyberattack: they play a crucial role in determining whether exposed company data was actually exfiltrated, and, if so, identifying the specific data that was compromised.

What can you detect using the logs?

  • Unusually large amounts of data retrieval, copying, or deletion requests.

  • Abnormalities based on geographic locations.

  • Unusual access from anonymous users.

  • Data destruction.

Configuration Tips and Tricks

AWS

  • Activate CloudTrail data events for data resources. CloudTrail data events cover multiple data-related services, including S3 buckets, DynamoDB, and EMR (they also include some non-data resources, like Lambda functions). They record access to stored data in detail and are configured on a trail using event selectors. Data events can be very high volume; to optimize collection and reduce unnecessary storage costs, use advanced event selectors to collect data events only from chosen resources (see the sketch after this list).

  • Avoid common S3 bucket log duplication. You can also use S3 access logs to track requests made to an Amazon S3 bucket. However, avoid enabling both S3 data events and S3 access logs for the same bucket, especially if you're concerned about costs: the two provide similar information, and the duplicate data rarely justifies the added expense. Generally, we recommend enabling only data events, because they contain richer AWS identity information, are easier to configure, and store all the details needed to understand access to the bucket.

  • Collect Redshift and RDS logs. Redshift logs are enabled per cluster and can be directed to either CloudWatch or S3 for storage. Configuring RDS logs is more complex: they are enabled per cluster or instance using Parameter Groups or Option Groups (depending on the type of RDS). These logs are gathered within the resource and can be sent to CloudWatch for centralized viewing.
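Here is a hedged boto3 sketch of the advanced-event-selector tip from the first bullet: collecting S3 data events only for one sensitive bucket. The trail and bucket names are hypothetical:

import boto3

cloudtrail = boto3.client("cloudtrail")

cloudtrail.put_event_selectors(
    TrailName="org-trail",  # hypothetical trail name
    AdvancedEventSelectors=[{
        "Name": "S3 data events for the sensitive bucket only",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
            {"Field": "resources.ARN",
             "StartsWith": ["arn:aws:s3:::customer-data-bucket/"]},  # hypothetical
        ],
    }],
)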

Azure

  • Enable data logs using diagnostic settings. Azure data logs are configured for each resource separately using its Diagnostic settings.

  • Turn on Entra authorization for storage accounts. In storage accounts, if Microsoft Entra authorization isn’t the default authentication type, actions made by Entra ID users appear in the logs with SAS token IDs instead of user names, which makes it extremely hard to attribute actions during an investigation. Enabling Entra authorization by default is also advised, as it ties user actions to their permissions.

GCP

  • Turn on Data Access logs. GCP Data access logs are configured individually for each service, or universally using the "all services" parameter. While enabling logging for all services can be convenient, it's important to note that it can also incur significant costs.
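To scope GCP Data Access logs to a single service rather than "all services," the auditConfigs entry can name the service directly. A short sketch of such an entry, applied with the same setIamPolicy flow shown earlier; the service choice here is illustrative:

# Example auditConfigs entry enabling data access logs for Cloud Storage only.
storage_audit_config = {
    "service": "storage.googleapis.com",
    "auditLogConfigs": [
        {"logType": "DATA_READ"},
        {"logType": "DATA_WRITE"},
    ],
}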

Optimize logging cost for public resources

Another way to optimize data events cost across cloud providers is to disable data read events for public resources. These events tend to be data-intensive and offer lower security value. For example, your organization probably doesn’t need to be collecting read events for an S3 bucket storing static files for a public site. Remember that enabling Write events for these resources remains crucial.
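One way to express "writes only" with CloudTrail advanced event selectors is to filter on the readOnly field. A sketch of such a selector (passed to put_event_selectors as above), with a hypothetical public bucket name:

# Collect only write (non-read) S3 data events for a public static-site bucket.
public_bucket_selector = {
    "Name": "Write-only data events for the public site bucket",
    "FieldSelectors": [
        {"Field": "eventCategory", "Equals": ["Data"]},
        {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
        {"Field": "readOnly", "Equals": ["false"]},
        {"Field": "resources.ARN",
         "StartsWith": ["arn:aws:s3:::public-site-assets/"]},  # hypothetical
    ],
}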

Partial AWS CloudTrail Data Event: S3 Log Object Retrieval

{
  "awsRegion": "us-east-1",
  "eventCategory": "Data",
  "eventName": "GetObject",
  "eventSource": "s3.amazonaws.com",
  "eventTime": "2023-03-19T17:35:57Z",
  "managementEvent": false,
  "readOnly": true,
  "recipientAccountId": "012345678910",
  "requestParameters": {
    "Host": "aws-cloudtrail-logs.s3.amazonaws.com",
    "bucketName": "aws-cloudtrail-logs",
    "key": "AWSLogs/012345678910/CloudTrail/ImportantLogFile.json.gz"
  },
  "sourceIPAddress": "12.23.34.45",
  "userAgent": "Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion",
  "userIdentity": {
    "accessKeyId": "ASIAABCDEFJHIGKLMN",
    "accountId": "012345678910",
    "arn": "arn:aws:sts::012345678910:assumed-role/AWSRole/BadUser",
  ...

Monitoring buckets containing sensitive company information is crucial, but other buckets also deserve attention. For instance, buckets storing logs are highly valuable to attackers, who can use them to learn the environment and mask their activities as normal behavior. Restrict access to log buckets strictly and monitor them closely for any signs of unusual access.
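As an illustration, a minimal sketch that scans S3 data events shaped like the record above for reads of a log bucket by principals outside an allowlist; the bucket name and role ARN are hypothetical:

LOG_BUCKET = "aws-cloudtrail-logs"  # hypothetical log bucket name
# Hypothetical allowlist: roles that legitimately read the logs (e.g. SIEM ingestion).
ALLOWED_PRINCIPAL_PREFIXES = (
    "arn:aws:sts::012345678910:assumed-role/SiemIngestRole/",
)

def suspicious_log_bucket_reads(data_events):
    """Flag GetObject calls on the log bucket by principals not on the allowlist."""
    hits = []
    for e in data_events:
        if (e.get("eventName") == "GetObject"
                and e.get("requestParameters", {}).get("bucketName") == LOG_BUCKET):
            arn = e.get("userIdentity", {}).get("arn", "")
            if not arn.startswith(ALLOWED_PRINCIPAL_PREFIXES):
                hits.append((arn, e.get("sourceIPAddress"), e.get("eventTime")))
    return hits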

Category 3: Network

Flow logs, DNS logs and Firewall logs

What do these logs contain? 

Network logs provide insight into all communication between resources within the cloud environment and with external entities. Unlike packet capture, these logs primarily store metadata about the communication rather than its content: source and destination IPs, ports, transfer sizes, external domains, and more. Across cloud providers, flow logs capture network communications between virtual machines, DNS logs record DNS queries, and firewall logs record communication permitted or blocked by the firewall.

Why should you collect them? 

Monitoring network logs serves as a critical tool for detecting compute-focused attack attempts. With on-premises attack techniques already well-established among threat actors, many of these methods are now being employed on compute resources hosted in the cloud as well. Communication logs play a significant role in detecting these threats. 

What can you detect using the logs?

  • Password brute force on VMs.

  • Port and IP scans.

  • Communication using unusual ports.

  • Communication with known malicious IPs and Domains.

  • Internal compute resources discovery.

Configuration tips and tricks

AWS

  • Enable VPC Flow Logs. You can create a flow log for a VPC, a subnet, or a network interface and send it to destinations such as S3 buckets, CloudWatch Logs, or Firehose. Although it's advisable to enable flow logs for all VPCs, in practice this can be quite costly, so prioritize critical or internet-facing VPCs to balance monitoring coverage against cost.

  • Add additional fields to Flow Logs. The default VPC flow log format, version 2, lacks valuable information that can be added as custom fields. We recommend adding the following fields to the default field set (a sketch follows the list):

Flow-direction, instance-id, pkt-dst-aws-service, pkt-dstaddr, pkt-src-aws-service, pkt-srcaddr, region, subnet-id, tcp-flags, traffic-path, type, vpc-id
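A hedged boto3 sketch that creates VPC flow logs with the recommended extra fields in a custom log format; the VPC ID and destination bucket are illustrative:

import boto3

ec2 = boto3.client("ec2")

# Custom format: the version-2 defaults plus the recommended extra fields.
log_format = (
    "${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} "
    "${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} "
    "${action} ${log-status} ${flow-direction} ${instance-id} "
    "${pkt-src-aws-service} ${pkt-dst-aws-service} ${pkt-srcaddr} ${pkt-dstaddr} "
    "${region} ${subnet-id} ${tcp-flags} ${traffic-path} ${type} ${vpc-id}"
)

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],          # hypothetical VPC
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::my-flow-logs-bucket",  # hypothetical bucket
    LogFormat=log_format,
)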

Azure

  • Enable flow logs through Network Watcher. Azure NSG flow logs (and the newer virtual network flow logs) are configured per region through Network Watcher and delivered to a storage account, from which they can be forwarded for analysis.

GCP  

  • Turn on Flow Logs. When you enable VPC Flow Logs, you enable logging for all VMs in a subnet. You can use filters to enable logs for specific VMs.
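A sketch using the google-cloud-compute client to enable flow logs on one subnet; the project, region, and subnet names are placeholders, and the sampling values are just examples of cost tuning:

from google.cloud import compute_v1

PROJECT, REGION, SUBNET = "my-project", "us-central1", "prod-subnet"  # placeholders

client = compute_v1.SubnetworksClient()
subnet = client.get(project=PROJECT, region=REGION, subnetwork=SUBNET)

# Enable flow logs; sampling and aggregation can be tuned to control cost.
subnet.log_config = compute_v1.SubnetworkLogConfig(
    enable=True,
    flow_sampling=0.5,
    aggregation_interval="INTERVAL_5_SEC",
)

# patch() requires the current fingerprint, which get() returned above.
client.patch(project=PROJECT, region=REGION,
             subnetwork=SUBNET, subnetwork_resource=subnet)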

Category 4: Secrets

AWS Secrets Manager and KMS logs, Azure Key Vault logs, and GCP Secret Manager and Cloud KMS logs

What do these logs contain?

Logs in the secrets category contain information about access to secrets and changes to their permissions.

Why should you collect them? 

Secrets are often considered a treasure sought after by attackers, granting them the means to move laterally to other resources. Therefore, monitoring for unusual access to secrets is crucial. Analyzing logs for access to secrets allows you to spot instances of secrets exfiltration. Additionally, monitoring encryption attempts or key deletions can aid in identifying ransomware attacks.

What can you detect using the logs?

  • Mass encryption for ransomware.

  • Deletion of encryption keys for ransomware.

  • Unusual secret retrieval.

  • Key vault configuration changes.

Configuration Tips and Tricks

AWS

  • Default logging using CloudTrail. KMS and Secrets Manager activity is logged by default in the CloudTrail management logs. When configuring the trail, make sure the “Exclude AWS KMS events” option is not selected. Note that Decrypt events are the second most common event in CloudTrail, potentially resulting in significant log volumes and increased logging costs.
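As a starting point for hunting unusual secret retrieval, a sketch that pulls recent GetSecretValue events from CloudTrail with boto3. Note that lookup_events only searches recent management events, so querying the trail's delivered logs (e.g. in a SIEM) is the durable approach:

import boto3
from datetime import datetime, timedelta, timezone

cloudtrail = boto3.client("cloudtrail")

# Pull GetSecretValue calls from the last 24 hours of management events.
paginator = cloudtrail.get_paginator("lookup_events")
pages = paginator.paginate(
    LookupAttributes=[{"AttributeKey": "EventName",
                       "AttributeValue": "GetSecretValue"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
)

for page in pages:
    for event in page["Events"]:
        # Each event carries the caller identity and timestamp for triage.
        print(event["EventTime"], event.get("Username"), event["EventName"])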

Azure

  • Enable Key Vault logs. Key Vault logs are configured for each vault separately using diagnostic settings.

GCP

  • Enable Data Access logs for secret usage. Admin operations on Secret Manager and Cloud KMS appear in the default Admin Activity audit logs, but actual secret and key usage (such as accessing secret versions or decrypt calls) is recorded in Data Access logs, which must be enabled for these services.


Category 5: Compute

What do these logs contain?

Compute logs refer to the logs of the compute resources in the environment, such as virtual machines, functions and containers. Some of them are OS logs, collected by the cloud provider from inside the instances/VMs using agents. Function logs relate to function manipulation and activation, and Container logs contain information about control plane activity within the cluster, including resource orchestration and editing.

Why should you collect them? 

Threat actors targeting the cloud have found value in reusing techniques honed during years of operations in on-prem networks, since many of these techniques are just as applicable to virtual machines and containers as they are to physical devices. Furthermore, attackers frequently leverage functions for execution and persistence within cloud environments.

What can you detect using the logs?

  • Malware file upload and execution

  • Command execution on VMs

  • Credential retrieval via the metadata service

  • Abuse of Lambda functions for persistence

  • Containers created with excessive permissions

Configuration tips and tricks

AWS

  • Enable instance logs. All control plane activity related to instances is logged in CloudTrail management events by default, and system-level logs can be collected by CloudWatch agents installed on the instances, which send them to CloudWatch.

  • Lambda logs. Lambda activity is collected as CloudTrail data events using event selectors; to collect data events for specific functions only, use the advanced event selectors option.

  • EKS logs. There are five types of EKS logs (API server, Audit, Authenticator, Controller manager, Scheduler), that are enabled separately for each cluster and then sent to CloudWatch.
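Enabling the five EKS control plane log types can be scripted; a hedged boto3 sketch with a hypothetical cluster name:

import boto3

eks = boto3.client("eks")

# Enable all five control plane log types; they are delivered to CloudWatch Logs.
eks.update_cluster_config(
    name="prod-cluster",  # hypothetical cluster
    logging={
        "clusterLogging": [{
            "types": ["api", "audit", "authenticator",
                      "controllerManager", "scheduler"],
            "enabled": True,
        }]
    },
)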

Azure

  • VM logs. VM logs are gathered by Azure Monitor agents installed within the VMs, which collect multiple log types, including Windows Event Logs and Linux Syslog.

  • Function, App Service, and AKS logs. These are configured for each resource using diagnostic settings.

GCP

  • VM, GKE, and function logs. Guest OS logs from Compute Engine VMs are collected by the Ops Agent installed inside the instances, GKE control plane audit logs are part of Cloud Audit Logs, and Cloud Functions execution logs flow to Cloud Logging automatically.

One final "champion" tip:

In many cases, empty regions do not generate logs, so enabling logging in these regions will incur minimal costs. Attackers often exploit unused regions, as they are typically less monitored, but once you have logs enabled, malicious activity in relatively quiet regions is easier to spot, since there's less noise to blend in with. Therefore, we recommend enabling logs such as CloudTrail and Flow Logs in unused regions, as it is a cost-effective measure to enhance security.
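A small sketch of how you might audit which enabled regions lack VPC flow logs, under the assumption that every region's VPCs (including default ones in otherwise unused regions) should be covered:

import boto3

ec2 = boto3.client("ec2")
regions = [r["RegionName"] for r in ec2.describe_regions()["Regions"]]

for region in regions:
    regional = boto3.client("ec2", region_name=region)
    vpcs = regional.describe_vpcs()["Vpcs"]
    flow_logs = regional.describe_flow_logs()["FlowLogs"]
    covered = {fl["ResourceId"] for fl in flow_logs}
    # Simplification: only checks VPC-level flow logs, not subnet/ENI-level ones.
    missing = [v["VpcId"] for v in vpcs if v["VpcId"] not in covered]
    if missing:
        print(f"{region}: VPCs without flow logs: {missing}")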

Conclusion

Effective monitoring of cloud logs is essential for rapid threat detection and response. In cloud environments, log data is often the only way that organizations can gain real-time insights into their cloud environments, identify anomalies, and react swiftly to potential incidents. This guide has provided practical steps to help you optimize your logging configuration, from selecting the right log types to collect to implementing cost-effective strategies to ensure you’re getting the most value out of the logs you gather.

Cloud logging is a dynamic process. As your cloud environment evolves, so too should your logging strategy. Continuously assess your logging practices to ensure they remain aligned with your evolving detection and response needs.
