this post was submitted on 26 Jul 2023
You're getting lost in the weeds here and completely misunderstanding both copyright law and the technology used here.
First of all, copyright law does not care about the algorithms used and how well they map what a human mind does. That's irrelevant. There's nothing in particular about copyright that applies only to humans but not to machines. Either a work is transformative or it isn't. Either it's derivative or it isn't.
What AI is doing is incorporating individual works into a much, much larger corpus of writing style and idioms. If an LLM sees an idiom used a handful of times, it might start using it where the context fits. If a human sees an idiom used a handful of times, they might do the same. That's true regardless of algorithm and there's certainly nothing in copyright or common sense that separates one from another. If I read enough Hunter S Thompson, I might start writing like him. If you feed an LLM enough of the same, it might too.
Where copyright comes into play is in whether the new work produced is derivative or transformative. If an entity writes and publishes a sequel to The Road, Cormac McCarthy's estate is owed some money. If an entity writes and publishes something vaguely (or even directly) inspired by McCarthy's writing, no money is owed. How that work came to be (algorithms or human flesh) is completely immaterial.
So it's really, really hard to make the case that there's any direct copyright infringement here. Absorbing material and incorporating it into future works is what the act of reading is.
The problem is that as a consumer, if I buy a book for $12, I'm fairly limited in how much use I can get out of it. I can only buy and read so many books in my lifetime, and I can only produce so much content. The same is not true for an LLM, so there is a case that Congress should charge them differently for using copyrighted works, but the idea that OpenAI should have to go to each author and negotiate each book would really just shut the whole project down. (And no, it wouldn't be directly negotiated with publishers, as authors often retain the right to deny or approve licensing.)
you're accusing me of what you are clearly doing after I've explained twice how you're doing that. I'm not going to waste my time doing it again. except:
except that the contention isn't necessarily over what work is being produced (although whether it's a derivative work is still a matter for a court to decide anyway), it's over the source material being used for training without compensation.
and, likewise, so are these companies who have been using copyrighted material - without compensating the content creators - to train their AIs.
That wouldn't be copyright infringement.
It isn't infringement to use a copyrighted work for whatever purpose you please. What's infringement is reproducing it.
and you accused me of "completely misunderstanding copyright law" lmao wow
Yes. I do. And I'm right.
lmao
It's infringement to use copyrighted material for commercial purposes.
If I buy my support staff "IT for Dummies", and they then, sometimes, reproduce the same/similar advice (turn it off and on again), I owe the textbook writers money? That's news to me.
No, it isn't. There are enumerated rights a copyright grants the holder a monopoly over. They are reproduction, derivative works, public performances, public displays, distribution, and digital transmission.
Commercial vs non-commercial has nothing to do with it, nor does field of endeavor. And aside from the granted monopoly, no other rights are granted. A copyright does not let you decide how your work is used once sold.
I don't know where you guys get these ideas.