Downes.ca ~ Stephen's Web ~ 15 LLM Jailbreaks That Shook AI Safety

Stephen Downes

Knowledge, Learning, Community

This is an interesting look at attacks on the integrity of large language models, not so much because it raises serious security concerns (it doesn't) but because it shows how creative attackers can be and how ineffective simply layering rule-based barriers (which is what things like 'banned words' are) can be.



Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2025
Last Updated: Jan 31, 2025 5:20 p.m.

Creative Commons License.
