Wiz collaborates with NVIDIA to advance ML research for data classification

Wiz Research taps NVIDIA NIM microservices for Meta Llama 3 models for sensitive data classification

3-minute read

GenAI and machine learning can analyze vast amounts of data in real time, identify patterns and anomalies, and support more efficient, data-driven decision making. Many security organizations already leverage GenAI through AI-powered security to increase efficiency and accuracy and to upskill their teams; you can find examples of how to apply AI-powered security in your organization in this blog. Wiz uses generative AI in its Ask AI capabilities, including AI-powered remediation, AI-assisted customization of Rego policies, and AI graph query, which converts natural language into graph queries. To further explore the power of GenAI, the Wiz Research Team experimented with a new AI-powered security use case: ML-based data classification using NVIDIA AI. This blog describes how the team used NVIDIA NIM for Meta Llama 3 models for data classification, with the goal of improving the accuracy and efficiency of sensitive database field identification.

NVIDIA NIM brings new levels of performance and efficiency to deploying open LLMs, streamlining data-driven tasks such as data classification. Our collaboration with Wiz demonstrates how AI through NIM can reshape data security, enabling organizations to identify and secure their sensitive information with unprecedented precision and speed.

Bartley Richardson, Director of Engineering at NVIDIA

Machine Learning-based data classification 

In today’s data-centric world, securing sensitive information is crucial, especially for businesses that handle large volumes of personally identifiable information (PII), financial data, and other confidential assets. Wiz DSPM helps organizations protect their cloud data by discovering and classifying sensitive data and assessing data risks across their cloud footprint.

ML offers clear benefits for data security as well. AI models can see the bigger picture, understand context, and identify patterns of potentially sensitive data. In our experiment, we used an LLM to determine whether a database contains sensitive data based only on the database name, table names, and table schemas. We then tried to identify the purpose of the database, for example whether it stores application-related data, financial data, and so on. Using the LLM, we analyzed all of this information to classify whether the database contains sensitive data, and also to describe what the database actually contains, for example specific application data (such as WordPress), application logs, or CRM data.
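As a minimal sketch of this approach, the classification input can be assembled from just the database name, table names, and column schemas. The prompt wording, the JSON response schema, and the sample `crm_prod` database below are our own illustrative assumptions, not Wiz's production prompts:

```python
import json

def build_classification_prompt(db_name, tables):
    """Format database metadata (name, tables, column schemas) into a
    prompt asking an LLM whether the database likely holds sensitive data."""
    schema_lines = [
        f"- {table}({', '.join(columns)})" for table, columns in tables.items()
    ]
    return (
        "You are a data-classification assistant.\n"
        f"Database name: {db_name}\n"
        "Tables and columns:\n" + "\n".join(schema_lines) + "\n"
        "Answer in JSON with keys 'contains_sensitive_data' (true/false), "
        "'purpose' (e.g. application data, logs, CRM, financial), and "
        "'evidence' (the column names that drove the decision)."
    )

def parse_classification(raw_reply):
    """Parse the model's JSON answer, tolerating prose around the JSON object."""
    start, end = raw_reply.find("{"), raw_reply.rfind("}") + 1
    return json.loads(raw_reply[start:end])

# Hypothetical CRM-style schema used only to illustrate the input format
prompt = build_classification_prompt(
    "crm_prod",
    {"customers": ["id", "full_name", "email", "ssn"],
     "orders": ["id", "customer_id", "amount"]},
)
```

Keeping the input to metadata only (no row data) is what makes this kind of classification cheap enough to run across a large cloud footprint.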

Leveraging NVIDIA NIM for running Llama models 

NVIDIA NIM, part of the NVIDIA AI Enterprise software platform, is a set of easy-to-use inference microservices for secure, reliable deployment of high-performance AI model inference. It lets organizations innovate with AI faster by providing containers to self-host GPU-accelerated inference microservices for pretrained and customized AI models. In collaboration with NVIDIA, the Wiz Research Team explored using NVIDIA NIM for data classification, running NIM microservices for Llama 3 models on AWS through the AWS Marketplace.

AI can bring great benefits to data security use cases. By using NVIDIA NIM, we saw improved efficiency and accuracy in data classification, and we look forward to further collaboration with NVIDIA.

Erez Harus, AI Research Lead at Wiz 

Setup was easy: in less than an hour, we deployed the Llama model with NVIDIA NIM behind an OpenAI-compatible API. In our tests, performance on NVIDIA NIM was 2x better than running the model as is, and 25% better than running on a popular open source framework on a single GPU. We could process more data faster on the same resources and scale easily by updating the NIM configuration to run multiple models on the same machine. For example, processing 1,000 data samples took on average 1.4 seconds with NIM, versus 1.8 seconds with popular open source inferencing software and almost 3 seconds running the model as is. NIM was also easy to scale: we could run several NIM instances on a stronger machine, allowing us to process data quickly.
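Because NIM exposes an OpenAI-compatible API, the deployed Llama 3 model can be queried with a standard chat-completions request. The sketch below is an assumption-laden illustration, not Wiz's code: the endpoint URL assumes a local NIM deployment on port 8000, and the model identifier follows NIM's naming convention but should be taken from your own deployment.

```python
import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed local NIM endpoint
MODEL = "meta/llama3-8b-instruct"  # assumed model id; check your deployment

def chat_payload(prompt, model=MODEL, temperature=0.0):
    """Build an OpenAI-style chat-completions request body.
    temperature=0.0 keeps classification output deterministic."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

def classify(prompt):
    """POST the request to the NIM endpoint and return the model's reply text."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Since the API shape matches OpenAI's, existing client code and SDKs can typically be pointed at the NIM endpoint with only a base-URL change.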

We look forward to further exploring AI-powered security use cases and collaborating with NVIDIA. Learn more about NVIDIA NIM and Wiz DSPM. If you would like to see Wiz in action, we would love to connect with you over a live demo.
