Securing Cloud Databases: Best Practices with ClickHouse and Wiz

How to protect sensitive data in cloud-hosted databases with built-in security controls, best practices, and continuous risk monitoring.


Introduction 

Recent incidents like the DeepSeek AI database leak highlight how misconfigurations in cloud-hosted databases can expose sensitive data. As organizations increasingly rely on cloud-native services, securing these environments is critical. The shared responsibility model places the onus on organizations to properly configure and monitor cloud infrastructure, making misconfigurations a leading cause of data exposure. A key challenge is maintaining continuous visibility: where is sensitive data stored, who has access to it, and how is it secured? Best practices and out-of-the-box security controls provided by vendors help reduce misconfigurations, but security teams must also proactively monitor for drift and policy violations.

ClickHouse, an open-source column-oriented database, offers built-in security features to help protect data. However, as cloud environments scale, security teams need deeper insights into potential risks, including misconfigurations and excessive access. By combining ClickHouse’s best practices with Wiz’s cloud security platform, organizations can continuously map their data exposure, detect risky configurations, and enforce security policies before an issue leads to a breach.

Best practices with ClickHouse

Note: While this blog post focuses on security best practices for open-source (OSS) ClickHouse, these same principles are foundational to how we secure ClickHouse Cloud, where we go even further with additional managed protections. By following these guidelines, self-hosted users can align with the same security standards we enforce in our cloud environment, ensuring robust data protection whether running ClickHouse on-prem or in the cloud.

In a recent blog post, the Wiz Research team identified a publicly accessible ClickHouse database belonging to DeepSeek. Thanks to this effort, the DeepSeek team was able to secure their instance immediately.

At the time of writing, there are over thirty thousand ClickHouse instances reachable from the internet. 

As ClickHouse becomes more mainstream, supporting a wide range of use cases including ML/AI, observability, and analytics, it is crucial for developers and engineers to know how they can protect themselves from making the same mistakes.

For this reason, the ClickHouse engineering team, in collaboration with Wiz, reflects in this blog post on the recent breach and on out-of-the-box defaults, and shares practical tips to help users protect their data.

Reflecting on the breach

Beyond the immediate impact on reputation and customer trust, the DeepSeek breach also carries potential financial, legal, and regulatory consequences. The visibility of this incident was particularly high, trending on Hacker News for several days, raising broader concerns about database security.

This naturally leads to the question—how did this happen, and what went wrong? As the maintainers of ClickHouse, we must also ask ourselves: Could we have done something better?

At first glance, the issue seems simple—the database should not have been publicly exposed. While that may be true for this specific deployment, ClickHouse can be safely deployed and exposed to the public when configured correctly. This is demonstrated by our public demo environments, where users can freely query datasets, even accessing underlying databases directly for analysis. Additionally, standard services in ClickHouse Cloud (excluding Private Link) are publicly accessible for user connections, with the option to add IP filtering.

The key difference is that in these properly configured deployments, we enforce several best practices, including enabling TLS, securing the administrator account with authentication, and applying strict authorization policies for API users. In DeepSeek’s case, none of these protections were in place, leaving the instance completely unprotected.

> Note that in the case of public demos (for example, https://sql.clickhouse.com/), we share password-less credentials with our users, but enforce strict access controls through RBAC, as well as limits on query complexity and quotas to prevent abusive usage.

The next question might be: how easy is this to do as a new user of ClickHouse? Was this simply the result of poor configuration defaults, from which other databases have suffered?

ClickHouse - Out-of-the-box security controls

ClickHouse implements multiple security controls during installation to prevent misconfigurations that could leave an instance publicly accessible without authentication or encryption. However, this incident has also highlighted opportunities for further strengthening our default protections.

During installation, administrators are prompted to set a password for the default user. If they choose to proceed without one, an explicit acknowledgment is required, ensuring they are aware of the security implications.

> Why allow users to skip authentication? This is a trade-off between security and usability. While the default user is restricted, this approach supports valid use cases like local testing and development, ensuring ease of setup without compromising control when stricter security is required.

However, in reviewing the DeepSeek incident, we recognized that the default user in Docker deployments could be created without a password. While this still required users to explicitly publish ports to expose the instance, it introduced a potential risk. To address this, we have now disabled all network access for the default user by default in official ClickHouse images. This change has been backported to the last three releases and two LTS versions, ensuring improved security for existing deployments.

Note that a standard ClickHouse installation does not expose the server to external networks, even if no password is set for the default user during setup. To allow external access, users must manually adjust ClickHouse’s network settings.

This behavior can be seen in this part of the code, preventing external access when no listen_host setting is configured. Any changes to this configuration come with an explicit warning, emphasizing the need for careful consideration before exposing the instance to external connections. 

In the default configuration, all examples of listen_host settings are commented out; as a result, the ClickHouse server falls back to listening only on the localhost interface.
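For reference, the relevant section of a shipped config.xml looks roughly like the following sketch, with every listen_host example commented out so the server binds only to localhost:

```xml
<!-- Listen on all interfaces (commented out by default): -->
<!-- <listen_host>::</listen_host> -->
<!-- <listen_host>0.0.0.0</listen_host> -->
<!-- Or on a specific interface (also commented out by default): -->
<!-- <listen_host>127.0.0.1</listen_host> -->
```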

While this limits direct exposure, there is still a risk—reverse proxy attacks, for example, could inadvertently expose the instance. To mitigate this, we strongly recommend setting a password for the default user as a baseline security measure.

In summary, the following misconfiguration was introduced to DeepSeek’s ClickHouse instance:

  • Publicly exposed without restrictions – The instance was accessible from the internet without IP filtering or binding to a private interface, leaving it vulnerable to unauthorized access.

  • No SSL/TLS encryption – Traffic was unencrypted, despite ClickHouse supporting TLS with Let’s Encrypt, making it trivial to secure connections.

  • Default user without a password – Authentication was not enforced, a critical misstep that should have been addressed during installation.

These represent the bare minimum security measures we recommend. While such configurations may be acceptable for local development and testing, production deployments require stricter security controls. Fortunately, ClickHouse has evolved over 15 years as a mature open-source project, offering robust security features to help teams safeguard their data effectively.

Securing a ClickHouse database

Out of the box, the OSS version of ClickHouse offers features such as:

  • Authentication & Access Control – Enforce secure authentication with password policies, certificate-based logins, SSH keys, and external providers like LDAP and Kerberos.

  • Role-Based Access Control (RBAC) – Grant fine-grained permissions to users and roles, ensuring controlled access to databases, tables, and system resources.

  • Query Quotas & Execution Limits – Prevent excessive resource consumption and data exfiltration by restricting query complexity, memory usage, and result sizes.

  • Network Security – Secure data in transit with SSL/TLS and mTLS, restrict network access with firewall policies, and limit exposure by configuring ClickHouse to listen only on required interfaces.

  • Data Encryption & Protection – Protect data at rest with virtual file system encryption and support secure storage on external disks (e.g., S3) to mitigate file system attacks.

  • Secret Management – Store and manage credentials securely using named collections, IAM roles, and automatic query masking to prevent credential leakage.

  • Auditing & Monitoring – Maintain visibility with detailed logs of authentication attempts, queries, and system activity, ensuring compliance and detecting suspicious behavior.

We’ll explore each of these in a little more detail and provide guidance on when each is appropriate.

Authentication & Access Control

By default, authentication relies on passwords, with users strongly encouraged to set a password for the default user during installation as emphasized earlier. Passwords are hashed using SHA-256 with a salt, preventing them from being stored in plain text. 

Over time, we’ve improved password hashing mechanisms, initially supporting SHA-256 with salt and double SHA-1 before introducing bcrypt, a slower hashing algorithm designed to meet industry best practices. Additionally, password complexity rules can be enforced to comply with NIST 800-63B Authenticator Assurance Level 1. 
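As a sketch of what enforcing complexity rules can look like, rules are declared in config.xml (the patterns and messages below are illustrative):

```xml
<clickhouse>
    <password_complexity>
        <rule>
            <pattern>.{12}</pattern>
            <message>be at least 12 characters long</message>
        </rule>
        <rule>
            <pattern>[0-9]</pattern>
            <message>contain at least 1 numeric character</message>
        </rule>
    </password_complexity>
</clickhouse>
```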

CREATE USER sec_user IDENTIFIED
WITH bcrypt_hash BY '$2y$10$3y43Ch......J9klbw4Twhy8YLm';

Beyond passwords, ClickHouse supports other authentication methods suited for different security needs. Certificate authentication allows secure, encrypted connections between services and ClickHouse without relying on traditional passwords. 

CREATE USER sec_user IDENTIFIED
WITH ssl_certificate CN 'serviceA.customer.com' 

More recently, SSH key authentication has been added, making it easier for developers to securely work with ClickHouse while avoiding credential management.

CREATE USER sec_user IDENTIFIED
WITH ssh_key BY KEY 'AAAAC3NzaC1lZDI...AtNYgwncUnjaSl4e1Od' TYPE 'ssh-ed25519'

For self-managed enterprise environments, ClickHouse integrates with external authentication providers, including LDAP for centralized authentication, HTTP endpoints for custom authentication flows, and Kerberos for single sign-on.
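As an illustrative sketch of the LDAP integration (the server name, host, and DNs below are placeholders), an LDAP server is declared in config.xml and can then be referenced when creating users:

```xml
<clickhouse>
    <ldap_servers>
        <my_ldap_server>
            <host>ldap.example.com</host>
            <port>636</port>
            <bind_dn>uid={user_name},ou=users,dc=example,dc=com</bind_dn>
            <enable_tls>yes</enable_tls>
        </my_ldap_server>
    </ldap_servers>
</clickhouse>
```

A user authenticating against this server can then be created with `CREATE USER ldap_user IDENTIFIED WITH ldap SERVER 'my_ldap_server';`.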

Finally, ClickHouse also provides additional controls to further restrict access. The VALID UNTIL setting ensures temporary access is automatically revoked after a specified period, reducing the risk of stale credentials. Authentication can also be restricted by IP using the HOST setting, preventing unauthorized access from untrusted locations. 

CREATE USER sec_user_5 IDENTIFIED
WITH bcrypt_hash BY '$2y$10$3y43Ch......J9klbw4m'
VALID UNTIL '2025-03-01'
HOST IP '10.1.1.0/24' 

These authentication and access control mechanisms ensure that ClickHouse can be deployed securely while remaining flexible enough to support a wide range of use cases.

Managing Permissions with RBAC

ClickHouse provides fine-grained Role-Based Access Control (RBAC), allowing administrators to precisely manage permissions by granting and revoking privileges at the user or role level to databases, tables and system resources. 

Permissions can be assigned to individual users or roles, making it easier to manage access across multiple accounts. Administrators can grant specific privileges, such as SELECT, CREATE, and INSERT, on particular databases or tables. They can also revoke permissions when necessary to maintain security.

GRANT SELECT ON system.* TO `developer_role`;
REVOKE SELECT ON system.query_* FROM `developer_role`;
GRANT SELECT, CREATE, INSERT ON `developer_db`.* TO `dev_user`;

Multiple roles can also be assigned directly to users or even nested within other roles to simplify permission management.

GRANT `developer_role`, `analytic_role` TO `dev_user`;

For cases where users need to delegate permissions, ClickHouse supports WITH GRANT OPTION, allowing them to pass on granted privileges to others:

GRANT ALL ON `developer_db`.* TO `dev_user` WITH GRANT OPTION;

Beyond granting access at the database or table level, ClickHouse also supports both static and dynamic row-level security policies. These policies ensure that users only see specific subsets of data based on predefined conditions. This is particularly useful when restricting access to sensitive records while allowing broader queries on the same table.

-- Static row policy: grants access to rows where security_tag = 'analytic'
CREATE ROW POLICY analytic ON secure.data USING security_tag = 'analytic' TO analytic_role;

-- Dynamic row policy: grants access based on user-specific tags retrieved from an HTTP request header
CREATE ROW POLICY header_fun ON secure.data FOR SELECT USING security_tag IN
(
    SELECT tag FROM security_tag_reference
    WHERE user = getClientHTTPHeader('X-USER')
)
TO grafana;

Query quotas and complexity limits

When exposing ClickHouse instances to untrusted users or restricting internal users from consuming excessive resources, administrators can enforce query quotas and complexity limits to prevent performance issues and unauthorized modifications. These measures ensure controlled resource usage while maintaining system stability.

To prevent accidental changes in production environments, administrators can enforce read-only access for certain users by setting `readonly=1`. This is ideal for service accounts or roles that should only perform queries without modifying the data.

CREATE USER sec_user_5 IDENTIFIED
WITH bcrypt_hash BY '$2y$10$3y43ChJKzPN15xbS400e6eQSgDtV7.zqU0mffFJ9klbw4Twhy8YLm'
SETTINGS readonly = 1;

Beyond read-only restrictions, ClickHouse provides quota management to prevent excessive data retrieval and limit system resource consumption. Administrators can configure limits on settings such as max_rows_to_read and max_memory_usage. These features are extensively used in public demo environments and are recommended for users exposing ClickHouse instances externally. Additional limits enforced through max_result_rows and max_result_bytes can help prevent large-scale data exfiltration.

CREATE USER sec_user_5 IDENTIFIED
WITH bcrypt_hash BY '$2y$10$3y43Ch......J9klbw4m'
VALID UNTIL '2025-03-01'
HOST IP '10.1.1.0/24'
SETTINGS max_memory_usage = 100000,
    max_rows_to_read = 10000,
    max_result_rows = 10;
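Per-query settings can also be complemented with interval-based quotas, which cap aggregate usage over a time window. A minimal sketch (the quota name and limits below are illustrative):

```sql
-- Limit a user to 100 queries and 10 million read rows per hour
CREATE QUOTA hourly_quota
    FOR INTERVAL 1 hour MAX queries = 100, read_rows = 10000000
    TO sec_user_5;
```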

For more scalable access control, settings profiles allow administrators to group multiple constraints into a single profile, which can then be assigned to users or roles. This simplifies configuration management and ensures consistency across different user groups.

CREATE SETTINGS PROFILE `play` SETTINGS
    readonly = 1,
    max_execution_time = 60,
    max_rows_to_read = 10000000000,
    max_result_rows = 1000,
    max_bytes_to_read = 1000000000000,
    max_result_bytes = 10000000,
    max_network_bandwidth = 25000000,
    max_memory_usage = 20000000000,
    max_bytes_before_external_group_by = 10000000000,
    enable_http_compression = true;

-- assign the settings profile to the user
ALTER USER play SETTINGS PROFILE `play`;

By applying query quotas, read-only restrictions, and settings profiles, administrators can confidently expose ClickHouse to external users while ensuring fair resource distribution.

Networking

When exposing ClickHouse to external users or limiting internal access, network security plays a crucial role in preventing unauthorized access and data breaches. Basic best practices include using firewalls and restricting security groups to only allow traffic from legitimate sources.

For ClickHouse itself, administrators should enforce TLS encryption to secure data in transit and prevent interception. As noted in the DeepSeek breach retrospective, server network interfaces are equally important: ClickHouse should only listen on explicitly required addresses, disabling any unnecessary interfaces to reduce the attack surface.

<!-- Restrict ClickHouse to listen only on a specific private IP -->
<listen_host>10.1.1.100</listen_host>
<!-- Disable wildcard binding to prevent exposure on public networks -->
<!-- <listen_host>::</listen_host> -->
<!-- <listen_host>0.0.0.0</listen_host> -->

For production deployments, especially those exposed to external traffic, enabling TLS is essential. As described in our documentation, securing client connections is straightforward thanks to Let’s Encrypt certificates.

https_port: 8443
tcp_port_secure: 9440
openSSL:
    server:
        certificateFile: '/etc/clickhouse-server/fullchain.pem'
        privateKeyFile: '/etc/clickhouse-server/privkey.pem'
        disableProtocols: 'sslv2,sslv3,tlsv1,tlsv1_1'

By following these best practices, administrators can significantly enhance ClickHouse’s security while maintaining accessibility for legitimate users.

Data protection

When handling sensitive data, encryption plays a critical role in protecting ClickHouse instances from unauthorized access. ClickHouse supports encryption at rest and built-in encryption functions for data security at different levels.

Encryption at rest is achieved through virtual file system encryption and the ability to store data on external disks such as S3. This ensures that even if an attacker gains access to the underlying file system, they cannot directly extract sensitive data. Storing data externally also mitigates risks from Local File Inclusion (LFI) and adds an additional layer of defense by slowing down the attacker in the event of a Remote Code Execution (RCE) attack.
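For self-managed deployments, a sketch of what this can look like is an encrypted disk wrapping an existing disk in storage_configuration (the disk names and hex key below are placeholders; consult the documentation for key management details):

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <encrypted_disk>
                <type>encrypted</type>
                <disk>default</disk>
                <path>encrypted/</path>
                <key_hex>00112233445566778899aabbccddeeff</key_hex>
            </encrypted_disk>
        </disks>
    </storage_configuration>
</clickhouse>
```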

In addition to encryption at rest, ClickHouse provides built-in functions to encrypt and decrypt data at query time, allowing developers to control access to sensitive information dynamically. The example below demonstrates how to use AES-256 encryption and decryption within a query:

WITH (
    SELECT encrypt('aes-256-ofb', 'ClickHouse has security functions', 'a_32characterkeylengthisrequired')
) AS encrypt_token
SELECT decrypt('aes-256-ofb', encrypt_token, 'a_32characterkeylengthisrequired');

┌─decrypt('aes-256⋯ [HIDDEN id: 2])─┐
│ ClickHouse has security functions │
└───────────────────────────────────┘

Secret management

When integrating ClickHouse with external services such as S3, PostgreSQL, or MongoDB, credentials are often required. 

For self-managed deployments, ClickHouse provides named collections, allowing credentials to be securely stored and referenced within queries instead of being part of the query. 

> In ClickHouse Cloud environments, IAM roles can be used for authentication when accessing S3, removing the need to store static credentials.

This reduces the risk of exposing sensitive information in logs or query history. Named collections can be configured in config.xml and referenced in queries:

<clickhouse>
    <named_collections>
        <s3_mydata>
            <access_key_id>AKIAIOSFODNN7EXAMPLE</access_key_id>
            <secret_access_key>wJalrXEXAMPLEKEYEXAMPLEKEYEXAMPLEKEY</secret_access_key>
            <format>CSV</format>
            <url>https://s3.us-east-1.amazonaws.com/yourbucket/mydata/</url>
        </s3_mydata>
    </named_collections>
</clickhouse>
Or via DDL:

CREATE NAMED COLLECTION s3_mydata AS
    access_key_id = 'AKIAIOSFODNN7EXAMPLE',
    secret_access_key = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    format = 'CSV',
    url = 'https://s3.us-east-1.amazonaws.com/yourbucket/mydata/';

Queries can then reference the named collection instead of including credentials directly:

INSERT INTO FUNCTION s3(s3_mydata, filename = 'test_file.tsv.gz',
    format = 'TSV', structure = 'number UInt64', compression_method = 'gzip')
SELECT * FROM numbers(10000);

Additionally, ClickHouse automatically masks credentials in logs when table functions require authentication. 

To further enforce security, query masking rules can be applied via regex to sanitize logs, preventing accidental exposure of sensitive information. These rules can be configured in config.xml under query_masking_rules.
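For example, a rule like the following (the rule name and pattern are illustrative) replaces matched fragments before queries are written to logs:

```xml
<query_masking_rules>
    <rule>
        <name>hide SSN</name>
        <regexp>[0-9]{3}-[0-9]{2}-[0-9]{4}</regexp>
        <replace>000-00-0000</replace>
    </rule>
</query_masking_rules>
```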

By combining IAM roles, named collections, log masking, and query masking rules, ClickHouse ensures secure credential management while minimizing exposure risks.

Auditing

For security purposes, the logging tables in ClickHouse’s system database provide a detailed audit log, allowing administrators to easily analyze past events.  

In system.session_log, administrators can find information about successful and unsuccessful authentication attempts, such as the event time, source IP address, the type of client used, and the network interface the client attempted to connect to.
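For example, a quick review of recent failed logins might look like the following (column names follow the system.session_log schema; adjust for your version as needed):

```sql
SELECT event_time, user, client_address
FROM system.session_log
WHERE type = 'LoginFailure'
ORDER BY event_time DESC
LIMIT 20;
```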

With system.query_log, administrators can review past user queries in detail, including the query timestamp, execution time, number of rows and bytes returned, the query string, the originating IP address, and much other helpful information.

In the event of a Denial of Service attack, the system.crash_log, system.error_log, and system.text_log tables could assist administrators in identifying the root cause and implementing mitigation measures.

Assurance


ClickHouse OSS serves as the foundation for our SaaS platform, ClickHouse Cloud. As a result, our engineers maintain a high standard for our open-source project to ensure stability and security for both OSS users and our Cloud customers. Our commitment is demonstrated through continuous fuzzing efforts and a bug bounty program that encourages security researchers to identify and report potential vulnerabilities.

Building on this strong foundation, ClickHouse Cloud enables customers to harness the performance and scalability of ClickHouse with confidence in its security. As a fully managed service, it offers SSO integration, customer-managed encryption keys, compute-compute separation, and many other security-focused features. ClickHouse Cloud has undergone rigorous compliance audits, meeting high standards such as ISO 27001, SOC 2 Type 2, PCI DSS, and HIPAA, ensuring best-in-class security practices. For more details, visit our Trust Center.

Prevention and monitoring: What can we do as the security team?

Implementing best practices is a foundational step toward securing cloud-hosted databases, but without continuous monitoring, misconfigurations and toxic combinations can easily go unnoticed. Security teams can leverage Wiz to:

  • Surface risky configurations: Wiz automatically identifies exposed ClickHouse instances, misconfigured access controls, and overly permissive network settings across cloud environments. Here are a few quick steps you can take in Wiz to surface these insights.

    • Click here to find your instances of ClickHouse (navigate to Inventory and, under Technologies, search for ‘ClickHouse’).

    • Click here to identify any publicly exposed ClickHouse instances.

    • Click here to identify whether a misconfiguration allows remote unauthenticated access to the ClickHouse database API (navigate to Policy - Host Configuration Rules and search for ClickHouse; the ‘Clickhouse allows remote anonymous access to API database’ rule checks for exactly this misconfiguration).

  • Prioritize risk based on context: Wiz enriches security findings with context from cloud infrastructure, helping teams focus on the risks that pose the greatest threat to sensitive data.

  • Monitor changes over time: Cloud environments are dynamic, with configurations constantly evolving. Wiz provides continuous assessment to help teams catch newly introduced risks before they lead to exposure.

  • Enable proactive remediation: Security teams can set up automated alerts and remediation workflows in Wiz to quickly address high-risk findings.

By integrating Wiz into their cloud security operations, organizations can build a scalable approach to protecting ClickHouse deployments and other cloud-hosted data services. Combining best practices with continuous risk monitoring ensures that security doesn't stop at configuration—it becomes an ongoing process of identifying, prioritizing, and mitigating risks in cloud environments.
