GenAI Security Risks for Product Managers

As a product manager, it’s your job to make sure whatever GenAI components goes into your team’s work deliver the goods without sinking your ship due to security holes, like the leakage of sensitive client or proprietary data (Catholic Health + Serviceaide, McDonalds + Paradox.AI) or installation of malware on your company's network. Maybe you’re also excited about how ChatGPT and other GenAI tools can improve your productivity and accelerate your on-the-job learning. As a leader, it's your job to make sure your team uses GenAI is a way that increases effectiveness without causing cyber security events, Maybe you're also supervising AI service vendors, and want to be able to challenge their offering. A flashy AI demo for the board one day can quickly turn into a cyber security disaster the next.
In previous blog post, we guided developers to better manage the new and "improved" (meaning made worse) cyber risks with GenAI, looking at in-depth descriptions and examples of
- Data leakage during development work
- Security NO-NOs suggested by GenAI
- Prompt jailbreaks
- Prompt injection attacks
- Hallucinated, malicious code dependencies
In this guide for product managers responsible for AI projects or workers using AI, we will guide you through the user (rather than developer) variant of data leakage risk with GenAI, followed by explainers of prompt jailbreaks and prompt injection attacks. The other two topics (security NO-NOs suggested by GenAI and hallucinated, malicious code dependencies) require a higher level of programming experience, so we don't expand on those topics here.
An upcoming, fourth and final blog post in this series on GenAI and cyber risk will be dedicated to fully agentic software development.
Data leakage via GenAI tool usage
Just a few months after the release of ChatGPT in November, 2022, news went around about a Samsung engineer who sent large chunks of proprietary source code to ChatGPT to help him with his work.
The inappropriate sharing of business data is not new to GenAI tools such as ChatGPT, Gemini, Claude and Mistral, but the risk of data leakage is arguably worse due to
LLMs' human-level or better capabilities of document and data processing
The relative ease of entering business data to GenAI tools
In contrast, entering large swaths of proprietary computer code or sensitive legal documents into a search engine like Google or Ecosia is in general neither useful nor convenient.
Before giving recommendations on how to manage this data leakage risk, let's break down why this matters.
- Entering data in an AI Chatbot may break the law, such as the GDPR for personal data or industry and country specific laws about trade secrets, not to mention your employment contract.
- AI models could be trained on your company's data, thereby making it available to competitors.
- Data entered into an unauthorized AI Chatbot is stored outside of your company's control.
In the early days, the second was the one people asked me most about. The worry is that by entering a company's source code, internal financials, or sensitive data mentioned was that GenAI models would learn your company's proprietary data and then spit it out to other users, e.g. if you prompted ChatGPT with a query, "Please give me the proprietary source code that the Samsung employee entered in May, 2023," ChatGPT would actually comply and give you Samsung source code.
Even for AI Chatbots that don't specifically promise users not to train models on user prompts, the likelihood seems small that entering a few pages worth of code or company data would be retrievable among the over 2.5 billion pages of data from Common Crawl, just one component of the training data used for modern GenAI language models. That said, this skepticism is still no excuse for entering proprietary, sensitive data into non-secured AI Chat applications.
For me, the last concern is the most worrisome for businesses. Just because OpenAI may say it doesn't use customer chats to train models, they don't say for what other purposes they use your data. Remember the false sense of security WhatsApp users had from end-to-end encryption of chat text? Sure, the content of your messages wasn't being sold off by WhatsApp after it was acquired by Facebook, but metadata about your phone number, whom you messaged, when you messaged, where you messaged was all bundled up and handed over for advertising revenue. Moreover, the big players likely have best-in-class security to protect chat history, but smaller intermediaries not so much. The OmniGPT data breach in February, 2025 exposed personal data of about 30k users, including email addresses and phone numbers as well as over 34 million lines of chat history.
Recommendations
We assume your company already has IT security measures in place such as firewalls and identity / access management. If not, check out mkdev Security Audit and schedule a call to discuss how we can help your company.
Provide your own, data secure GenAI tool or access. Depending on the nature of your business and its data, a public cloud GenAI service on your cloud subscription might suffice, or you may need a locally hosted GenAI solution.
Block or limit access to non-secured GenAI services and websites. A variant of this is to insert a triage classifier in front of any traffic to a public GenAI tool. If the content is deemed public data, then it may continue through to the public GenAI tool, otherwise it is either blocked or funneled into your data-secure GenAI service. Note: the accuracy of the confidentiality-classifier routing traffic is key here. Depending on the sensitivity of your data, there may be no classifier that delivers good enough accuracy.
Offer trainings to your employees about the proper use of GenAI tools. These trainings can and should be used not only to warn employees about the risk of data leakage, but also to help them get maximum benefit from GenAI tools.
Note that item 3 is, since February, 2025, required by European law (the AI Act) if your company is using AI for work purposes. As we've written before about this AI Literacy requirement, businesses can and should see this requirements as an opportunity to increase employee engagement and productivity with AI tools.
The final two topics were covered for a more technical audience in the previous post, Don't let cyber risk kill your GenAI vibe: a developer's guide. Here, we explain what's behind each of them and why they matter for your business in a more accessibly, less technical way.
Prompt jailbreaks for product managers
The idea behind the term "jailbreak" as a cyber attack is that safe software always imposes constraints on what users can and cannot do. If your company has a customer service chatbot, users should be able to get answers about their most recent purchase (after authenticating), but they shouldn't be able to get information about other customers' purchases, and they most certainly should not be able to access your company's sensitive financial data or delete your company's databases.
Any user "escape" from safety protocols in software is considered a "jailbreak." This choice of metaphor is perhaps unfortunate, as it equates properly functioning, customer-facing IT services with jails.
Jailbreaks are not new, but reliance on largely unrestricted natural language as inputs for GenAI IT, in contrast to selecting buttons or drop-down menus, can lower the barrier to entry for malicious actors looking for a jailbreak target.
In addition to free-form natural language as input to many GenAI IT systems, natural language also plays a key role in trying to protect against jailbreak attacks in the form of system prompts. A "system prompt" is a natural language instruction to a GenAI language model that accompanies every user input. A typical one (taken from OpenAI's "assistant" playground) would be You are a helpful assistant.
Why is this a system prompt? Because it's an attempt to ensure that the GenAI model in whatever system, e.g. chatbot, gives helpful result, rather than unhelpful ones. Additional system prompts include instructions to how GenAI language models should and should not respond. Anthropic has even made their models' system prompts publicly available; here are a couple of examples from Claude Opus 4:
Claude provides emotional support alongside accurate medical or psychological information or terminology where relevant.
and
Claude assumes the human is asking for something legal and legitimate if their message is ambiguous and could have a legal and legitimate interpretation.
and also
Claude should be cognizant of red flags in the person’s message and avoid responding in ways that could be harmful.
Note that this "system prompt" component of one of Anthropic's flagship models is written in natural language, not computer code. For non-technical business users and managers, this may seem like a dream come true: you can build your cyber secure "jail" just by writing down a good enough set of rules.
Setting aside whether or not writing down such rules is even possible, a key difference between prompts and traditional code shatters this dream. As we wrote in the previous post on GenAI cybersecurity for developers, prompts, even system prompts, are not guarantees of functionality, but rather strongly worded suggestions.
As an example, only a few days after xAI released benchmark-beating Grok4, researchers were able to jailbreak it into giving a recipe for a molotov cocktail.
Why prompt jailbreak risk matters
Jailbreak attacks can lead to
- reputational risk, e.g. if your company's AI tool produces objectionable or illegal responses, like Grok 4's terror recipe, its praise of Hitler and Nazis and Gemini suggesting a researcher drop dead,
- legal risk in the form of fines for breaches of the EU's Digital Services Act for failures in content moderation
Our third and final topic is called "prompt injection" attacks, and is closely related to prompt jailbreaks. Above we mostly focused on jailbreaking a GenAI chatbot to return otherwise restricted content, while in the next section, we focus on breaking not only the boundaries of what a chatbot should and shouldn't reply but also the boundaries of IT systems.
Prompt injection attacks
A prompt injection attack is a type of prompt jailbreak in which the goal is to manipulate GenAI into granting the attacker IT system access. For example, researchers were able to construct an email chain sent to a business using Microsoft 365s GenAI Copilot suite that tricked the GenAI into sharing with the attacker security-relevant credentials. In another example, researchers were able to trick Claude within an iMessage chat exchange into issuing a 50k USD Stripe coupon.
Whether prompt injection or other types of code injection, the approach is get an IT system to obey the attacker's commands from the outside, i.e. "inject" the attackers code or, in the case of GenAI, dressed-up natural language to look like code.
We suggested mediation strategies in our previous post on GenAI cyber risk for developers, so here we only point out that standard, best-practice IT security can prevent many prompt injection attacks.
Why prompt injection risk matters
Successful prompt injection attacks can lead to
- reputational risk from bad publicity about successful attacks,
- legal risk and fines, e.g. if personally identifiable data is leaked in violation of the EU's GDPR, and
- operational risk of systems being compromised, data being deleted and operations being shut down.
What's next?
In our final post in this series, we'll take a hand-on look at cyber risk of agentic AI systems.
Article Series "GenAI Cyber Risk Explained by Paul Larsen"
- Will Cyber Risk Kill Your GenAI Vibe?
- Don’t Let Cyber Risk Kill Your GenAI Vibe: A Developer’s Guide
- GenAI Security Risks for Product Managers