Here be RED Dragons: Generative AI Risks for Criminal Misuses
PhD project
PhD student:
Supervisors:
Björn Ross (School of Informatics), Alex Taylor (School of Informatics)
Outputs from this project
Forthcoming!
The rapid development of generative models has created new opportunities for misuse, presenting significant challenges for academics and practitioners alike. One of the most pressing issues is systematically identifying weaknesses in these models and understanding how such vulnerabilities relate to real-world criminal risks. An increasingly recognised approach in AI safety is red-teaming: structured adversarial testing designed to probe and circumvent model safeguards.
Red Dragons aims to help address this challenge by collaborating directly with national and international organisations, including law-enforcement experts, to investigate how generative models can be misused for criminal purposes, including child abuse. Building on these insights, the project will develop systematic red-teaming methodologies and related benchmarks to strengthen model safety and trustworthiness.
Collaborators: Dr Rebecca Portnoff, Dan Sexton (CTO of IWF), OCCIT
Funder: UKRI AI Centre for Doctoral Training in Responsible and Trustworthy in-the-world Natural Language Processing
Project dates: 2024 –