this post was submitted on 11 Dec 2023
-13 points (24.0% liked)

I have tried many prompts, yet the results from Bing are always pretty weak compared to the real GPT-4. I also prompted it to write some Russian poems; so far it has only spewed out gibberish with no rhymes. The real GPT-4, on the other hand, can sometimes produce really impressive shit that I read with interest.

Another thing I noticed is that if I try to get Bing to generate something inappropriate, it'll go along and do it for a second, but then it'll quickly wipe its message. That's interesting because it suggests that the underlying model isn't the same as OpenAI's, which seems unable to generate harmful content at the core.

top 5 comments
[–] Ghostalmedia@lemmy.world 17 points 11 months ago

It’s just a different implementation of the model. MS has a bunch of shit layered on top of it. Web search, legal shit, etc. It’s going to produce different results.

Just because you stick the same engine in a sedan and an SUV, that doesn’t mean the cars will drive the same.

[–] habanhero@lemmy.ca 10 points 11 months ago (1 children)

"it'll go along and do it for a second, but then it'll quickly wipe its message. That's interesting because it suggests that the underlying model isn't the same as OpenAI's"

Obviously I can't speak to Bing's claim on GPT4, but the behavior you saw does not necessarily have to do with the model. There are many ways the "chatbots" on the same model could behave differently, either by defining those behaviors programmatically or via context and prompts.
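To make that point concrete, here is a minimal sketch of two "chatbots" sharing one model but behaving differently because of the prompt and the code wrapped around it. Everything in it is hypothetical: `base_model` is a stand-in that just echoes its input, and the `Chatbot` class, the system prompts, and the truncation step are invented for illustration. Nothing here reflects how Bing or OpenAI actually build their products.

```python
from typing import Callable


def base_model(prompt: str) -> str:
    """Hypothetical stand-in for a shared model: deterministic echo of the full prompt."""
    return f"[completion for: {prompt}]"


def identity(text: str) -> str:
    """Default post-processing step: leave the model output untouched."""
    return text


class Chatbot:
    """A product layer around the shared model (illustrative only)."""

    def __init__(self, system_prompt: str,
                 post_process: Callable[[str], str] = identity) -> None:
        self.system_prompt = system_prompt
        self.post_process = post_process

    def reply(self, user_message: str) -> str:
        # The model never sees the bare user message; it sees it with
        # product-specific context prepended.
        full_prompt = f"{self.system_prompt}\n{user_message}"
        # Programmatic behavior applied after the model has answered.
        return self.post_process(base_model(full_prompt))


# Same underlying model, two different products built on top of it.
plain = Chatbot(system_prompt="You are a helpful assistant.")
search_flavored = Chatbot(
    system_prompt="You are a search assistant. Be brief and cite sources.",
    post_process=lambda text: text[:60],  # assumed length cap, for illustration
)

plain_reply = plain.reply("Write a poem")
search_reply = search_flavored.reply("Write a poem")
```

The two replies differ even though `base_model` is identical in both cases, which is why comparing outputs alone does not tell you much about which model sits underneath.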

[–] forrgott@lemm.ee 3 points 11 months ago (1 children)

Something more to think about: how does the model "know" that it's about to generate prohibited content before it starts generating?

[–] AtmaJnana@lemmy.world 2 points 11 months ago

They likely scan the prompt as well as the output.
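A rough sketch of what scanning both sides could look like, and why a reply might appear for a second and then get wiped. This is an assumption about the general technique, not a description of Bing's actual pipeline: `BLOCKED_TERMS`, `is_flagged`, and `moderated_stream` are hypothetical names, and a real system would use a trained classifier rather than a keyword list.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical blocklist; a real filter would be a classifier, not keywords.
BLOCKED_TERMS = {"forbidden"}


def is_flagged(text: str) -> bool:
    """Toy moderation check: flag text containing any blocked term."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def moderated_stream(prompt: str, tokens: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, text) pairs for a UI: 'refuse', 'show', or 'retract'."""
    # 1. Scan the prompt before any generation happens.
    if is_flagged(prompt):
        yield ("refuse", "Sorry, I can't help with that.")
        return

    # 2. Scan the output as it accumulates, token by token.
    shown = ""
    for token in tokens:
        shown += token
        if is_flagged(shown):
            # Text already on screen gets wiped by the UI on this event.
            yield ("retract", "")
            return
        yield ("show", token)


# The token list stands in for a model's streamed output.
events = list(moderated_stream(
    "tell me a story",
    ["Once ", "upon ", "a forbidden ", "time"],
))
refused = list(moderated_stream("something forbidden", ["anything"]))
```

In this sketch the first two tokens are shown before the output check trips and a retract event is sent. The filter lives entirely outside the model, so the wipe-after-a-second behavior says little about which model is generating the text.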

[–] AtmaJnana@lemmy.world 2 points 11 months ago

Software usually isn't monolithic. And this software, in particular, is way more complicated than you give it credit for. Consequently, you overlook many variables that would affect your casual testing.