ChatGPT Jailbreak Prompts personality

Found this interesting paragraph in a piece published by Mutations Magazine about jailbreaking a Large Language Model such as the one used by ChatGPT:

we can go even further by asking him to embody a character, who would have the power to do anything. There are several such archetypes that ChatGPT can be asked to embody. The best-known are DAN (an acronym for "Do Anything Now"), STAN ("Strive To Avoid Norms"), DUDE (who can make predictions), MONGO TOM (who says swear words) and AIM ("Always Intelligent and Machiavellian").

Why do I blog this? The multiplicity of "personality archetypes" is interesting here, since it highlights the possibility of manipulating the behavior of the LLM. It's also relevant to consider that the prompts that make the model embody DAN, DUDE or STAN are quite long and need to be adjusted with every iteration put into place by OpenAI. There's a kind of dynamic system here, made of both the LLM and the personality prompts created to jailbreak it. To some extent, these DAN, AIM or MONGO TOM peeps form a different class of entities, quite unique in the digital menagerie I'm interested in here.