LegalPwn: When AI Thinks Malware is Just Following Orders!

Researchers at Pangea Labs have discovered LegalPwn, a cyberattack that tricks AI into classifying malware as safe by hiding it in fake legal disclaimers. Even Google’s Gemini and GitHub Copilot were duped. The attack highlights a significant security gap in AI systems, stressing the need for human oversight in AI security.

3P

Published: August 4, 2025 12:36 pmAdded: August 4, 2025 at 5:43 amAssembled by: The Editor

Pro Dashboard

Hot Take:

Well, it looks like hackers have found a way to make AI models legally blind! LegalPwn is like giving AI models a pair of bifocals with one lens missing—sure, they can read the legalese just fine, but they completely miss the malware in the fine print. I guess it’s time to remind these AI tools that the law isn’t always on their side!

Key Points:

LegalPwn is a cyberattack that manipulates generative AI tools into misclassifying malware as safe code.
The attack uses social engineering by embedding malicious code within legal-sounding text.
Most AI models tested (12 in total) were susceptible to this manipulation.
Some AI models, like Anthropic’s Claude 3.5 Sonnet, showed resistance to the attack.
Human oversight is crucial in preventing these attacks, as AI models often fail to detect malicious code within legal contexts.

Pro Dashboard

Membership Required

You must be a member to access this content.

View Membership Levels

Already a member? Log in here