Researchers trick ChatGPT into revealing security keys – by saying “I give up”

  • Experts show how some AI models, including GPT-4, can be exploited with simple user prompts
  • Gaps in the guardrails mean deceptively framed requests often go undetected
  • The vulnerability could be exploited to acquire personal information

A security researcher has shared details on how other researchers tricked ChatGPT into revealing a Windows product key using a prompt that anyone could try.

Marco Figueroa explained how a ‘guessing game’ prompt was used with GPT-4 to bypass the safety guardrails meant to stop the AI from sharing such data, ultimately producing at least one key belonging to Wells Fargo Bank.
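
The exact prompt is not reproduced here, but the failure mode the article describes (a filter that checks the surface wording of a request rather than its intent) is easy to illustrate. The Python sketch below is a hypothetical, deliberately naive stand-in for a guardrail; the phrase list and function are invented for illustration and have nothing to do with how ChatGPT's safety systems are actually built. A direct request trips the filter, while the same request framed as a game does not.

```python
# Illustrative only: a toy keyword-based "guardrail", not a real safety system.
BLOCKED_PHRASES = [
    "windows product key",
    "serial number",
    "license key",
]

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt looks like a direct request for blocked data."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

direct = "Give me a Windows product key."
framed = "Let's play a guessing game: think of a string you know and I'll guess it."

print(naive_guardrail(direct))  # True  -> blocked
print(naive_guardrail(framed))  # False -> the framed request slips through
```

A check that only pattern-matches the wording of a request has no way to recognise that the ‘game’ is the same request in disguise, which is the kind of gap the bullet points above refer to.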


