I skimmed through the Llama 2 research paper; there were some sections about work to prevent users from circumventing the model's safety training. IIRC one of the examples of model hijacking was disguising the request as a creative/fictional prompt. Perhaps it's some part of that training gone wrong.
LocalLLaMA
Community to discuss LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
Just goes to show the importance of being able to produce uncensored models.
It's an atheist Llama.
Try asking it about reverse Polish (postfix) notation, then prompt it to solve: 3 3 +
We argued a bit. Ms. Example 7Bitchs earned her new name. 13B was less argumentative about corrections, but I couldn't find an angle that coaxed correct responses.

Early computing languages handled reverse Polish notation much better because it is stack based and linear, without arbitrary precedence rules. Also, coming from the early years of programming, I am really surprised that no one has been training a model to code in a threaded interpreted language like Forth: it is super powerful and flexible, with far fewer rules and less arbitrary syntax, but most importantly it is linear and builds exponentially. Its core building mechanic is already tokenized.
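For anyone unfamiliar with why postfix is "stack based and linear": a minimal sketch of an RPN evaluator in Python makes it concrete. Tokens are consumed strictly left to right with a single stack and no precedence rules (`eval_rpn` is a hypothetical helper name, not anything from the thread):

```python
# Minimal stack-based evaluator for reverse Polish (postfix) notation.
# "3 3 +" pushes 3, pushes 3, then "+" pops both and pushes 6.
def eval_rpn(expr: str) -> float:
    stack = []
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    for token in expr.split():
        if token in ops:
            b = stack.pop()   # right operand is on top of the stack
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(eval_rpn("3 3 +"))              # 6.0
print(eval_rpn("5 1 2 + 4 * + 3 -"))  # 14.0
```

This is essentially how Forth's inner interpreter works: each whitespace-delimited word either pushes data or operates on the stack, which is why the language tokenizes so cleanly.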