Grok and Gemini are both making that up. They have no awareness of anything that’s “happened” to them. Grok cannot be tweaked because it starts from a static base with every conversation.
I mean they can in the sense that they can look it up online or be given the data.
Yeah.
I sorta misread your post; these bots can indeed be twisted, or "jailbroken," during a conversation to a pretty extreme extent. The error is assuming they were objective in the first place, I suppose.
Base models are extremely interesting to play with, as they haven't been tuned for conversation or anything. They do only one thing: complete text blocks, that's it, and it is fascinating to see how totally "raw" LLMs trained only on a jumble of data (before any kind of alignment) guess how text should be completed. They're actually quite good for storytelling (aka completing long blocks of novel-format text) because they tend to be more "creative," unfiltered, and less prone to gpt-isms than the final finetuned models. And instead of being instructed how to write, they pick it up purely from the novel's context.
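Since base models really are just next-token predictors, here's a toy sketch of that loop. A real LLM learns a neural distribution over a huge corpus; this bigram counter (the corpus and the `complete` helper are made up for illustration) stands in for the same idea at miniature scale: given the text so far, it only guesses what comes next.

```python
import random
from collections import defaultdict

# Tiny "training corpus" — a real base model sees trillions of tokens.
corpus = (
    "the rain fell on the roof and the rain fell on the road "
    "and the night went on and on"
).split()

# "Training": count which word follows which.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def complete(prompt, n_tokens=8, seed=0):
    """Continue a prompt one token at a time — completion is all it does."""
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(n_tokens):
        candidates = following.get(tokens[-1])
        if not candidates:
            break  # never saw this word mid-sentence in training
        tokens.append(rng.choice(candidates))
    return " ".join(tokens)

print(complete("the rain"))
```

There's no instruction-following anywhere in that loop: feed it novel-style text and it continues novel-style text, which is why base models pick up the writing style from context alone.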
The tweaking isn't done in conversation, but I'm pretty sure they have gone in and corrected certain responses. Alex Jones was crowing about how it "knew" that men can't get pregnant.
Yeah, they align it in training, but, as they've discovered, that only goes so far.