this post was submitted on 19 Oct 2024

98 points (87.7% liked)

Hey Lemmings, what's some gibberish that we can say to throw off Machine Learning? (reddthat.com)

submitted 1 month ago by POTOOOOOOOO@reddthat.com to c/asklemmy@lemmy.ml

115 comments

[–] FaceDeer@fedia.io 8 points 1 month ago (2 children)

You realize that this is only going to train LLMs to recognize "gibberish"?

[–] prex@aussie.zone 3 points 1 month ago (1 children)

This is the correct answer.

The only solution is to deeply integrate the gibberish into everything we post.

I, for one, welcome our insane (unsane?) overlords.

[–] FaceDeer@fedia.io 3 points 1 month ago (1 children)

I don't see how that would be practical. People who aren't "in on the joke", as it were, will call out the gibberish and downvote it. If enough people are "in on the joke" then the whole forum becomes useless and some other forum will be created to fill the role of the original. The AI will train off of that one.

Basically, if you don't want an AI training on your content, then don't post your content in public where an AI will see it. The Fediverse is the last place you should be posting since its very nature is about openly broadcasting your content to whoever wants to see it.

[–] prex@aussie.zone 1 points 1 month ago (1 children)

OTOH, people are better at filtering out, or at least recognising, gibberish than LLMs. At least for now.

You are right about the fediverse being used for training content though.

I'm curious about the levels of bot posting compared to xitter etc. A low rate here would make it even more attractive as training data, since human-written text helps prevent model collapse.

[–] FaceDeer@fedia.io 1 points 1 month ago (1 children)

Well, the "at least for now" part is my point - if people start using "gibberish" to communicate or to hide their communication, that provides training material for LLMs to let them figure out how to use it too.

LLMs learn how to communicate based on existing examples of communication. As long as humans are communicating with each other somehow then LLMs will be able to train how to do that too. They have the same communication capabilities that we do at this point, so there's not really any way we can make a secret clubhouse that they can't figure out how to infiltrate.

Personally, I think there are two main routes we can go to deal with this. Either we can simply accept that there's no way to be 100% sure we're talking to a human any more and evaluate the value of our conversation based on the content of the words spoken rather than the composition of the entity generating them, or we could come up with some kind of "proof of personhood" system to allow people to label the text they write as coming from them.

The latter is extremely hard to do, of course, both from a technical and cultural perspective. And such a system would likely still allow someone's "person token" to be sneakily used by AI, either by voluntarily delegating it (I could very well be retyping all of this out of a ChatGPT window) or through hackery.

So I'm inclined toward the former. If I'm chatting with someone and I'm having a good time doing it, and then later I find out it was a bot, why should that change how much fun I had?

[–] prex@aussie.zone 1 points 1 month ago (1 children)

My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.

There is still a lot of understanding that we do automatically that an llm will never do. I still reckon I can spot gibberish better than an llm & I would like to keep it that way.

Or we just give up. As you can see I have mostly given up.

[–] FaceDeer@fedia.io 1 points 1 month ago (1 children)

> My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.

We'd be covering ourselves in poop to prevent people from sitting next to us on the train. Sure, people will avoid sitting next to us, but in the meantime we'll be covered in poop.

And then other people will learn the trick, cover themselves in poop too, and now everyone's poopy and the trick stops working.

> There is still a lot of understanding that we do automatically that an llm will never do.

Are you willing to bet the convenience of comprehensible online discourse on that? "Automatically understanding stuff" is basically the one job of LLMs.

LLMs model language, and coming up with some kind of "gibberish" filter is simply inventing a new language. If there's semantic meaning in it the LLMs will figure it out just like any other language, and if there isn't semantic meaning then we've lost the ability to communicate entirely. I see no upside.

[–] prex@aussie.zone 1 points 1 month ago (1 children)

Sigh

I think the point of gibberish is that it is not language.

That's why imagination and creativity are required - no?

I feel like I'm talking to an llm right now. Too many words.

[–] FaceDeer@fedia.io 1 points 1 month ago

If it's not communicating anything, what's the point?

[–] POTOOOOOOOO@reddthat.com 0 points 1 month ago

This post is satirical