Most companies have implemented protocols for when an employee emails confidential information to the wrong person. A new version of that problem occurs when an employee uploads sensitive information to a consumer (i.e., not enterprise) AI tool, which gives rise to the following questions:
- Can the data be clawed back or deleted, and if so, how?
- Can humans at the AI provider view the sensitive data?
- Can other users of the AI tool find the sensitive data?
- Have any contractual notification obligations been triggered?
- Have any regulatory data breach notification obligations been triggered?
In this Debevoise Data Blog post, we answer each of these questions.
A. Fact Gathering
Data: The nature of the uploaded data will impact notification obligations and overall risk. Did it include:
- Material nonpublic or sensitive client information;
- Sensitive company information; or
- Sensitive personal information of employees, clients, or third parties (including identifying each of the specific data elements involved, such as name, phone number, or Social Security number)?
Quickly identifying the affected data is a key initial step so that the company can determine the best containment measures and assess any potential notification obligations associated with the event.
AI Tool: AI providers differ in how they treat confidential information uploaded to their consumer tools, and there are often differences among the various tools from the same provider. For example, Anthropic’s Claude models generally do not train on user data by default, while DeepSeek’s models do. Knowing exactly which tool and model were involved will allow the company to review the applicable website, terms of use, FAQs, security policies, and privacy notices. These will help the company determine what data is used for training by default, how long data is retained, when it is deleted, what triggers a deletion, whom to contact about an accidental upload, and what information the company should be prepared to provide.
B. Reducing Risk
Having gathered the necessary information, the company should take steps to contain the impact of the event. Depending on the platform involved, the data affected, and the potential obligations, a company may consider:
- Account settings adjustments. The company should work with the employee to determine what privacy and security controls the employee had enabled on their account and choose the most secure options, if they were not already selected. Changing these settings may result in the data not being used for analytics and model training purposes.
- Deletion. Depending on the platform, deleting a user’s interaction history (and in some cases, the entire account) will delete the uploaded data from the provider’s servers. AI providers’ policies often contain details about how the deletion process works and how long it may take for the deletion to occur.
- Outreach. The company should consider whether contacting the AI tool’s provider will help delete the uploaded data. The decision to reach out should depend on where the provider is located and whether it has a formal or informal process for the notification and remediation of accidentally uploaded confidential data.
C. Notifications
Whether the event triggers contractual or regulatory notification obligations depends on several factors, including:
- Whether personal information was exposed, and if so, what kinds of personal information, and the location of the exposed individuals;
- Whether there is a credible risk of harm; and
- The specific language of the relevant contractual or regulatory provision.
Data Exposed. For regulatory breach notification, assessing the nature of the exposed data is a familiar exercise, one that focuses mostly on sensitive personal information. By contrast, contractual notification obligations are usually triggered by unauthorized access to any confidential corporate information that was provided by the third party.
Unauthorized Access. Most regulatory and contractual notification obligations are only triggered when there has been unauthorized access to (and, in some cases, acquisition of) the confidential data. When the data is uploaded to a consumer version of an AI tool, there may be no reason to believe that any human at the AI provider or anywhere else will ever see that data.
Risk of Harm. Similarly, many state and federal notification obligations are triggered only if there is a reasonable risk of harm. That condition may not be satisfied in many cases of accidental uploads to an AI system because, for many AI tools, the risk that input data will be reviewed by or exposed to any third party is extremely limited.
D. Key Takeaways
1. Plan. Companies should consider preparing for an AI data leakage event in the same way they might prepare for misdirected emails or other types of data loss, including determining how they would classify such an event under their cyber incident response plans and who would lead the response.
2. Take Preventative Action. Employees may upload sensitive data to consumer AI tools because they mistakenly believe they are using the enterprise version of the tool. To prevent this confusion, companies may consider disabling access to the consumer versions of the AI tools for which they hold enterprise licenses. They may also consider deploying data loss prevention tools that prevent sensitive information from leaving the company’s servers.
3. Review Contracts. Companies should, with this kind of event in mind, review the breach notification provisions of their third-party contracts and consider whether it may be appropriate to amend any language going forward.
4. Conduct a Post-Mortem. As with a cybersecurity incident, taking a step back after responding to an accidental AI upload, and looking for potential enhancements to policies, procedures, controls, or training, can help prevent a recurrence and better mitigate the impact if one occurs.
***
To subscribe to the Data Blog, please click here.
The cover art used in this blog post was generated by ChatGPT.