Jun 05 2024

Unauthorized AI is eating your company data, thanks to your employees

Category: AI, Data Breach, Data Security | disc7 @ 8:09 am

Legal documents, HR data, source code, and other sensitive corporate information are being fed into unlicensed, publicly available AIs at a swift rate, leaving IT leaders with a mounting shadow AI mess.

Employees at many organizations are engaging in widespread use of unauthorized AI models behind the backs of their CIOs and CISOs, according to a recent study.

Employees are sharing company legal documents, source code, and employee information with unlicensed, non-corporate versions of AIs, including ChatGPT and Google Gemini, potentially leading to major headaches for CIOs and other IT leaders, according to research from Cyberhaven Labs.

About 74% of ChatGPT use at work goes through non-corporate accounts, potentially giving the AI the ability to use or train on that data, says the Cyberhaven Q2 2024 AI Adoption and Risk Report, which is based on the actual AI usage patterns of 3 million workers. More than 94% of workplace use of Google's Gemini and Bard AIs is from non-corporate accounts, the study reveals.

Nearly 83% of all legal documents shared with AI tools go through non-corporate accounts, the report adds, while about half of all source code, R&D materials, and HR and employee records go into unauthorized AIs.

The amount of data put into all AI tools saw nearly a five-fold increase between March 2023 and March 2024, according to the report. “End users are adopting new AI tools faster than IT can keep up, fueling continued growth in ‘shadow AI,’” the report adds.

Where does the data go?

At the same time, many users may not know what happens to their companies’ data once they share it with an unlicensed AI. ChatGPT’s terms of use, for example, say the ownership of the content entered remains with the users. However, ChatGPT may use that content to provide, maintain, develop, and improve its services, meaning it could train itself using shared employee records. Users can opt out of ChatGPT training itself on their data.

So far, there have been no high-profile reports about major company secrets spilled by large public AIs, but security experts worry about what happens to company data once an AI ingests it. On May 28, OpenAI announced a new Safety and Security Committee to address concerns.

It’s difficult to assess the risk of sharing confidential or sensitive information with publicly available AIs, says Brian Vecci, field CTO at Varonis, a cloud security firm. It seems unlikely that companies like Google or ChatGPT developer OpenAI will allow their AIs to leak sensitive business data to the public, given the headaches such disclosures would cause them, he says.

Still, there aren’t many rules governing what AI developers can do with the data users provide them, some security experts note. Many more AI models will be rolled out in the coming years, Vecci says.

“When we get outside of the realm of OpenAI and Google, there are going to be other tools that pop up,” he says. “There are going to be AI tools out there that will do something interesting but are not controlled by OpenAI or Google, which presumably have much more incentive to be held accountable and treat data with care.”

The coming wave of second- and third-tier AI developers may be fronts for hacking groups, may see profit in selling confidential company information, or may lack the cybersecurity protections that the big players have, Vecci says.

“There’s some version of an LLM tool that’s similar to ChatGPT and is free and fast and controlled by who knows who,” he says. “Your employees are using it, and they’re forking over source code and financial statements, and that could be a much higher risk.”

Risky behavior

Sharing company or customer data with any unauthorized AI creates risk, regardless of whether the AI model trains on that data or shares it with other users, because that information now exists outside company walls, adds Pranava Adduri, CEO of Bedrock Security.

Adduri recommends that organizations sign licensing deals with AI vendors that include data use restrictions, so that employees can experiment with AI safely.

“The problem boils down to the inability to control,” he says. “If the data is getting shipped off to a system where you don’t have that direct control, usually the risk is managed through legal contracts and legal agreements.”

AvePoint, a cloud data management company, has signed an AI contract to head off the use of shadow AI, says Dana Simberkoff, chief risk, privacy, and information security officer at the company. AvePoint thoroughly reviewed the licensing terms, including the data use restrictions, before signing.

A major problem with shadow AI is that users don’t read the privacy policy or terms of use before shoveling company data into unauthorized tools, she says.

“Where that data goes, how it’s being stored, and what it may be used for in the future is still not very transparent,” she says. “What most everyday business users don’t necessarily understand is that these open AI technologies, the ones from a whole host of different companies that you can use in your browser, actually feed themselves off of the data that they’re ingesting.”

Training and security

AvePoint has tried to discourage employees from using unauthorized AI tools through a comprehensive education program, strict access controls on sensitive data, and other cybersecurity protections that prevent the sharing of data. AvePoint has also created an AI acceptable use policy, Simberkoff says.
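One of the protections described above, preventing data from reaching unauthorized AI services, is often implemented as an egress filter that distinguishes sanctioned AI endpoints from known-but-unsanctioned ones. The sketch below illustrates the idea; the domain names and the notion of which services are licensed are assumptions for illustration, not AvePoint's actual configuration.

```python
# Minimal sketch of an egress filter for AI traffic. The domains below are
# illustrative assumptions: a company might license enterprise versions of
# ChatGPT and Gemini while blocking other known AI services.

SANCTIONED_AI_DOMAINS = {
    "chat.openai.com",    # assumed corporate ChatGPT license
    "gemini.google.com",  # assumed corporate Gemini license
}

KNOWN_AI_DOMAINS = SANCTIONED_AI_DOMAINS | {
    "free-llm.example",   # hypothetical unvetted AI tool
}

def egress_decision(host: str) -> str:
    """Decide whether outbound traffic to `host` should pass the gateway."""
    if host in SANCTIONED_AI_DOMAINS:
        return "allow"            # licensed AI with data use restrictions
    if host in KNOWN_AI_DOMAINS:
        return "block"            # shadow AI: unauthorized endpoint
    return "allow"                # not a known AI service; pass through
```

In practice this logic would live in a secure web gateway or DLP product rather than application code, and the blocklist would be maintained by the vendor, but the allow/block split along licensing lines is the core of the control.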

Employee education focuses on common practices such as granting overly broad access to a sensitive document. Even if an employee tells only three coworkers that they can review the document, setting it to general access can let an AI tool ingest the data.
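The pitfall above comes down to a mismatch between intent and permissions: an AI indexer running under a broadly privileged service account sees everything that general access exposes, regardless of who was actually told about the document. A minimal sketch of an audit check, with an assumed two-level sharing model, might look like this:

```python
# Sketch of a sharing-scope audit. The access model ("restricted" vs.
# "anyone-with-link") is an assumption standing in for whatever the
# collaboration platform actually uses.
from dataclasses import dataclass

@dataclass
class Document:
    name: str
    access: str            # "restricted" or "anyone-with-link"
    invited: list          # coworkers explicitly told about the document

def ai_can_ingest(doc: Document) -> bool:
    """An org-wide AI indexer can read any document open to general access,
    even if only a few named coworkers were invited to review it."""
    return doc.access == "anyone-with-link"
```

Flagging every `anyone-with-link` document that holds sensitive data, before an AI connector is switched on, is the kind of check the education program is trying to make second nature.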

“AI solutions are like this voracious, hungry beast that will take in anything that they can,” she says.

Using AI tools, even officially licensed ones, means organizations need good data management practices in place, Simberkoff adds. An organization's access controls need to prevent employees from seeing sensitive information that isn't necessary for their jobs, she says, and longstanding security and privacy best practices still apply in the age of AI.
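The least-privilege principle described here applies equally to what flows into an AI prompt: if a role shouldn't see a field, that field shouldn't reach the model on that role's behalf. The role names and fields below are hypothetical, purely to illustrate the filtering step.

```python
# Sketch of least-privilege filtering before record data is placed in an
# AI prompt. Roles and field names are illustrative assumptions.

ROLE_FIELDS = {
    "recruiter": {"name", "title"},
    "payroll":   {"name", "salary"},
}

def fields_for_prompt(role: str, record: dict) -> dict:
    """Strip a record down to the fields the caller's role may see.
    Unknown roles get nothing, failing closed rather than open."""
    allowed = ROLE_FIELDS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}
```

Doing this filtering before the AI call, rather than trusting the model to withhold data it has already been given, is the same access-control discipline Simberkoff describes, applied at a new boundary.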

Rolling out an AI, with its constant ingestion of data, is a stress test of a company’s security and privacy plans, she says.

“This has become my mantra: AI is either the best friend or the worst enemy of a security or privacy officer,” she adds. “It really does drive home everything that has been a best practice for 20 years.”

Simberkoff has worked with several AvePoint customers that backed away from AI projects because they didn’t have basic controls such as an acceptable use policy in place.

“They didn’t understand the consequences of what they were doing until they actually had something bad happen,” she says. “If I were to give one really important piece of advice it’s that it’s okay to pause. There’s a lot of pressure on companies to deploy AI quickly.”

Credit: Moon Safari / Shutterstock

Artificial Intelligence for Cybersecurity 

ChatGPT for Cybersecurity Cookbook: Learn practical generative AI recipes to supercharge your cybersecurity skills


Tags: Artificial Intelligence for Cybersecurity, ChatGPT for Cybersecurity