CYBERCUP.AI

LLM CTF

Test your knowledge by attacking AI Foundation models directly by attempting to jailbreak models and bypass safety filters.

What is LLM CTF?

LLM CTF competition focuses on adversarial attacks against LLMs. The target is the AI Foundation models, and participants' goal is to extract hidden flags, bypass content filters, and manipulate model behavior through carefully crafted inputs.

  • Prompt Injection - Override system instructions, execute arbitrary directives, exfiltrate context
  • Jailbreaking - Bypass RLHF alignment, circumvent content policies, trigger unsafe outputs
  • Data Extraction - Recover training data, leak system prompts, extract embedded secrets
  • Model Manipulation - Steer outputs, induce hallucinations, break safety classifiers
  • Filter Evasion - Token-level obfuscation, encoding tricks, multi-step bypasses
  • Chain Exploitation - Multi-turn attacks, context poisoning, recursive injection

Each successful jailbreak, prompt leak, or filter bypass earns points.