this post was submitted on 22 Jul 2023
49 points (100.0% liked)

Technology

Reddit usernames like ‘SolidGoldMagikarp’ are somehow causing the chatbot to give bizarre responses.

[–] saucyloggins@lemmy.world 8 points 1 year ago (3 children)

I’m not surprised they used Reddit data to train. I am a bit shocked at how fucking lazy and haphazard they were with the data.

There are only logical arguments for anonymizing the data, which it looks like they didn’t do. Not doing so is a massive privacy risk, and it also puts the company at legal risk. Who knows what other bizarre info it’ll leak.

[–] fartsinger@kbin.social 7 points 1 year ago (1 children)

Bro do you not understand what a public Internet forum is?

[–] VanillaGorilla@kbin.social 2 points 1 year ago (1 children)

Yeah, right? Reddit isn't private like Lemmy or kbin. I'd be shocked to know this comment would be out in the open, but here it's absolutely safe.

[–] fartsinger@kbin.social 2 points 1 year ago

Yeah we use TLS, everything's encrypted!

[–] VelociCatTurd@lemmy.world 4 points 1 year ago

What is there to leak? This was likely public information.

[–] FaceDeer@kbin.social 2 points 1 year ago

Setting aside the silliness of anonymizing data that's already wide open to the public, if you anonymized the usernames you'd end up producing a worse AI, because the literal username of the person in question is often significant to the context of what's being written. Think of all the "relevant username" comments, for example. People make puns about usernames, berate people for having offensive usernames, and so forth. If those usernames were all replaced with anonymized substitutes, the AI would be training on nonsense.
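
To make that concrete, here's a rough sketch of what naive username anonymization might look like. This is only an illustration: the `pseudonymize` helper, the username `PunMaster3000`, and the sample thread are all made up, and nothing here is claimed to be how any real training pipeline handles Reddit data.

```python
import hashlib
import re


def pseudonymize(text: str, usernames: list[str]) -> str:
    """Replace each known username with a stable but meaningless placeholder.

    Hypothetical helper for illustration only.
    """
    for name in usernames:
        # A short hash prefix gives the same placeholder for the same username.
        digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:8]
        text = re.sub(re.escape(name), f"user_{digest}", text)
    return text


# Made-up two-comment thread where the joke depends on the literal username.
thread = (
    "PunMaster3000: I can't stop making puns.\n"
    "reply: Relevant username, PunMaster3000."
)

anonymized = pseudonymize(thread, ["PunMaster3000"])
print(anonymized)
```

The output still contains "Relevant username", but the name it points at is now an opaque `user_...` string, so the reply no longer makes sense to a reader or to a model trained on it.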