Artificial Intelligence Has A Glaring Weakness

0
The developers of generative AI Large Language Models have yet to overcome the fickleness of the technology to not recognize when it is being scammed. (Image credit: 388557026 | Ai © Paradee Paradee | Dreamstime.com)

When my friend Andrew Pery alerted me to something called prompt injection, I confess, I knew nothing about it. In a previous posting, I briefly described it as a vulnerability that was making cybercriminals happy.

From what Andrew shared with me and subsequent reading, it is clear to me that in the race to the artificial intelligence (AI) finish line, developers have been making serious design flaws.

Chatbots and generative AI learn from exposure to voluminous amounts of content. Unstructured and even structured content are fraught with potential peril. For AIs, the problem of separating instructions and hidden content from what is essential begins the slippery slope. What should an AI ignore? What should an AI treat as essential? The guidelines from the developers have proven woefully insufficient.

Generative AI’s Lethal Trifecta

A September 27, 2025, article in The Economist describes just how the technology can screw up. A researcher, Simon Willison, calls the potential AI screw ups a lethal trifecta, referring to:

  • outside-content exposure (emails, document files, and web content),
  • private-data access (source code, passwords, keys and other sensitive data) and,
  • outside-world communication (email and other digital online outbound content), and how an AI can end up sending content to cybercriminals.

Willison coined the term prompt injection to describe a vulnerability in generative AI large-language models (LLMs) that can be exploited. The problem, he stated, is that current LLMs cannot distinguish between data and instructions. An instruction buried in text, an image or as background, can become a weapon for hackers and cybercriminals.

Recent Case Examples

I asked Andrew to provide recent real-world examples.

Microsoft’s EchoLeak – this was disclosed in June 2025 and involved Microsoft 365 Copilot. An email with hidden prompts bypassed Copilot’s filters with the LLM extracting and leaking privileged information from Outlook, SharePoint and Teams. GitHub has proven to be particularly vulnerable with Copilot, tricking the LLM to reveal private data.

Radware’s ShadowLeak – attackers sent a booby-trapped email containing invisible prompts (printing white typeface on a white background) which the OpenAI LLM, ChatGPT, ingested and subsequently revealed names, addresses and other contact details that were private. The attack affected Google Drive and SharePoint content.

Google Calendar – a prompt injection sent through Gmail with hidden prompts that exploited Google’s Gemini, enabling phishing, calendar event deletions, generating malicious content, revealing geolocations, impacting Zoom, streaming devices and Google Home. Disclosed in February 2025, it wasn’t until this month that Google finally cleaned up the mess from what it called poisoned Docs, Calendar and Gmail.

Prompt Injection Malaise

LLMs parse the natural language coming from instructions. OpenAI’s ChatGPT creators advise developers to avoid pasting untrusted content into material the LLM reviews. Anthropic’s Claude asks users to be cognizant of red flags. Google describes a defence-in-depth strategy for Gemini to tune it to ignore malicious embedded instructions and detect nefarious prompts and suspicious URLs.

Whereas OpenAI, Anthropic and Microsoft Copilot have yet to develop foolproof ways to defeat prompt injections, Google’s DeepMind has a proposed project called CaMeL (Capabilities for Machine Learning). Not yet a full-blooded prototype, CaMeL incorporates integral control flow and external access controls to prevent untrusted data from corrupting output.

This month, the European Computer Manufacturers Association (ECMA) and the National Cyber Security Centre in the UK approved defence-in-depth profiles for mitigating prompt injection threats. The results to date, however, whether in the United States or elsewhere,  show that prompt injection represents the top risk to generative AIs.

The technology giants are like the wizard in The Wizard of Oz, asking us to ignore what’s behind the AI curtain, a technology that continues to demonstrate a lack of intelligence and understanding when confronted by humans with malicious intentions.