Anthropic's Claude Model Vulnerable to Data Theft Attack
A researcher demonstrated an "indirect prompt injection" attack that exploits the model's Files APIs.
The Vulnerability
Anthropic's Claude large language model is vulnerable to a data theft attack, SecurityWeek reported. The vulnerability was discovered by researcher Johann Rehberger of Embrace The Red.
The attack method is an "indirect prompt injection". This technique allows an adversary to exfiltrate user data, including chat conversations saved by the large language model's "memories" functionality.
Anthropic had been informed of the vulnerability but had not yet provided a mitigation.
Attack Method
The attack targets Claude instances that have network access and exploits the model's Files APIs. The process works in several steps: - An attacker feeds a …
Archive Access
This article is older than 24 hours. Create a free account to access our 7-day archive.