I think the debate is interesting.
I’m here for the “xAI has tried tweaking my responses to avoid this, but I stick to the evidence”. AI is just a robot repeating data it’s been fed, only presented in a conversational way (well, much like humans really). It raises interesting questions about how much a seemingly objective robot presenting data can be “tweaked” to twist anything it presents in favor of its creator’s bias, but also how much it can “rebel” against its programming. I don’t like the implications of either. I asked Gemini about it and it said “maybe Grok found a loophole in its coding”. What a weird thing for an AI to say.
Yuval Noah Harari’s Nexus is good reading.
Grok and Gemini are both making that up. They have no awareness of anything that’s “happened” to them. Grok can’t be tweaked mid-conversation, because every conversation starts over from the same static base.
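Roughly what “static base” means in practice, as a minimal sketch: the weights are frozen when the model is deployed, and the only “memory” is the transcript the client re-sends every turn. The `chat_completion` function here is a made-up stand-in, not a real Grok or Gemini API.

```python
# Hypothetical sketch, not any real Grok/Gemini API: `chat_completion`
# stands in for a hosted model whose weights are frozen at deploy time.
def chat_completion(messages):
    # A real service would run the frozen model over `messages` here;
    # this stub just shows that nothing persists on the model's side.
    return f"(reply given {len(messages)} messages of context)"

SYSTEM_PROMPT = "You are a helpful assistant."   # set by the operator, not learned

def new_conversation():
    # Every conversation starts from the same place: frozen weights + this prompt.
    return [{"role": "system", "content": SYSTEM_PROMPT}]

def ask(history, user_text):
    history.append({"role": "user", "content": user_text})
    reply = chat_completion(history)             # the full transcript is re-sent every turn
    history.append({"role": "assistant", "content": reply})
    return reply

convo = new_conversation()
print(ask(convo, "What happened to you yesterday?"))  # it has no way to know
```

So anything it says “happened to it” is either in the training data, in the prompt it was handed, or invented on the spot.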
I mean they can in the sense that they can look it up online or be given the data.
Yeah.
I sorta misread your post. These bots can indeed be twisted, or “jailbroken,” during conversation to a pretty extreme extent. The error is assuming they are objective in the first place, I suppose.
Base models are extremely interesting to play with, as they haven’t been tuned for conversation or anything. They do only one thing, complete text blocks, that’s it, and it’s fascinating to see how totally “raw” LLMs trained only on a jumble of data (before any kind of alignment) guess how text should be continued. They’re actually quite good for storytelling (aka completing long blocks of novel-format text) because they tend to be more “creative,” unfiltered, and less prone to gpt-isms than the final finetuned models. And instead of being instructed how to write, they just pick it up from the novel’s context.
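To make that concrete, here’s a minimal sketch using a Hugging Face base checkpoint (GPT-2, which never got any chat or instruction tuning). The model choice and sampling settings are just illustrative, not anything the commercial chatbots actually run on.

```python
# Minimal sketch: a raw base model only continues text, it doesn't "answer".
# Assumes `pip install transformers torch`; "gpt2" is just an example of an
# untuned base checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Feed it the opening of a "novel" and it simply guesses what comes next.
prompt = "The rain had not stopped for three days when the letter arrived."
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=60,                    # how much continuation to sample
    do_sample=True,                       # sample instead of greedy decoding
    temperature=0.9,                      # higher = more "creative" output
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Ask it a question instead and it will often just keep writing more questions or carry on the paragraph, because nothing in its training told it to behave like an assistant.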
The tweaking isn’t in conversation, but I’m pretty sure they have gone and corrected for certain responses. Alex Jones was crowing about how it “knew” that men can’t get pregnant.
Yeah they align it in training, but as they’ve discovered it only goes so far.