this post was submitted on 23 Sep 2023
Technology
That's basically all that I'm talking about here, yeah. I'm saying that the current laws don't appear to say anything against training AIs on public data. The AI model is not a copy of that data, nor is its output.
Indeed. Things are not illegal by default; there needs to be a law or some sort of precedent that makes them illegal. In the realm of LLMs that's very sparse right now, for exactly the reason you say: nobody anticipated it, so nobody wrote any laws forbidding it.
There are things that you can use intellectual property for that do not require consent in the first place. Fair use describes various categories of that. If it's not illegal to use copyrighted material without permission when training AIs, why would it matter whether the license permitted it or the author consented to it?
Wouldn't requiring licensing of data for the training of LLMs stack things even more in the favour of big IP-owning platforms?
Again, as I said before, if you think some specific bit of LLM output is violating the copyright of some code you wrote, there are already laws in place specifically covering that situation. You can go to court and show that the two pieces of code are substantially identical and sue for damages or whatever. The AI model itself is another matter, though, and I doubt any current laws would count it as a "copy" of the data that went into training it.