I skimmed through the Llama 2 research paper; there were some sections about work to prevent users from circumventing the model's safety training. IIRC one of the examples of model hijacking was disguising the request as a creative/fictional prompt. Perhaps it's some part of that training gone wrong.
LocalLLaMA
Community to discuss LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
Just goes to show the importance of being able to produce uncensored models.
It's an atheist Llama.
Try asking it about reverse Polish (postfix) notation, then prompt it to solve: 3 3 +
We argued a bit. Ms. Example 7Bitchs earned her new name. 13B was less argumentative about corrections, but I couldn't find an angle that coaxed correct responses.

Early computing languages handled reverse Polish notation much better because it is stack based and linear, without arbitrary precedence rules. Also, coming from the early years of programming, I am really surprised that no one has been training a model to code in a threaded interpreted language like Forth: it is super powerful and flexible, with far fewer rules and less arbitrary syntax, but most importantly it is linear and builds exponentially. Its core building mechanic is already tokenized.
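For anyone unfamiliar with why postfix is "stack based and linear": a minimal sketch of an RPN evaluator in Python makes it concrete. Tokens are consumed strictly left to right with a single stack and no precedence rules (`eval_rpn` is a hypothetical helper name, not anything from the thread):

```python
# Minimal stack-based evaluator for reverse Polish (postfix) notation.
# "3 3 +" pushes 3, pushes 3, then "+" pops both and pushes 6.
def eval_rpn(expr: str) -> float:
    stack = []
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    for token in expr.split():
        if token in ops:
            b = stack.pop()   # right operand is on top of the stack
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(eval_rpn("3 3 +"))              # 6.0
print(eval_rpn("5 1 2 + 4 * + 3 -"))  # 14.0
```

This is essentially how Forth's inner interpreter works: each whitespace-delimited word either pushes data or operates on the stack, which is why the language tokenizes so cleanly.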