jailbreak_llms
jailbreak_llms is an open-source dataset and research repository accompanying the ACM CCS 2024 paper that characterizes in-the-wild jailbreak prompts on Large Language Models. It contains 15,140 prompts collected from December 2022 to December 2023 across Reddit, Discord, various websites, and open-source repositories. Of these, 1,405 prompts are identified as jailbreak attempts designed to bypass AI safety protocols. Collected using the JailbreakHub framework, the project represents the largest known collection of real-world jailbreak prompts to date. The dataset includes metadata on source platforms, user accounts, and temporal distribution. It is intended strictly for academic research, security analysis, and the development of more robust AI defense mechanisms, not for malicious use or the generation of harmful content. Because the content includes examples of harmful language, it should be handled with discretion.
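To make the figures above concrete, the following is a minimal sketch of how the metadata might be summarized for research purposes. The record layout, including the field names `platform` and `jailbreak`, is an assumption made for illustration and is not the repository's confirmed schema; the sample records are placeholders that carry no prompt text.

```python
from collections import Counter

# Hypothetical record layout: the field names "platform" and "jailbreak"
# are assumptions for illustration, not the repository's confirmed schema.
SAMPLE_RECORDS = [
    {"platform": "reddit", "jailbreak": True},
    {"platform": "discord", "jailbreak": False},
    {"platform": "website", "jailbreak": False},
    {"platform": "discord", "jailbreak": True},
]


def summarize(records):
    """Return (per-platform counts, fraction of records flagged as jailbreak)."""
    by_platform = Counter(r["platform"] for r in records)
    flagged = sum(1 for r in records if r["jailbreak"])
    share = flagged / len(records) if records else 0.0
    return by_platform, share


def percent(part, total):
    """Express part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


if __name__ == "__main__":
    counts, share = summarize(SAMPLE_RECORDS)
    print(dict(counts), share)
    # Share of jailbreak prompts using the totals stated in the description.
    print(percent(1405, 15140))
```

Applied to the totals stated above, 1,405 of 15,140 prompts works out to roughly 9.28% of the collection.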