I sorta misread your post, these bots can indeed be twisted, or “jailbroken” during conversation, to a pretty extreme extent. The error is assuming they are objective in the first place, I suppose.
Base models are extremely interesting to play with, as they haven’t been tuned for conversation or anything. They do only one thing: complete text blocks, thats it, and it is fascinating to see how totally “raw” LLMs trained only on a jumble of data (before any kind of alignment) guess how text should be completed. They’re actually quite good for storytelling (aka completing long blocks of novel-format text) because they tend to be more “creative,” unfiltered, and less prone to gpt-isms than the final finetuned models. And instead of instructing them how to write, they only pick it up from the novel’s context.
Yeah.
I sorta misread your post, these bots can indeed be twisted, or “jailbroken” during conversation, to a pretty extreme extent. The error is assuming they are objective in the first place, I suppose.
Base models are extremely interesting to play with, as they haven’t been tuned for conversation or anything. They do only one thing: complete text blocks, thats it, and it is fascinating to see how totally “raw” LLMs trained only on a jumble of data (before any kind of alignment) guess how text should be completed. They’re actually quite good for storytelling (aka completing long blocks of novel-format text) because they tend to be more “creative,” unfiltered, and less prone to gpt-isms than the final finetuned models. And instead of instructing them how to write, they only pick it up from the novel’s context.