this post was submitted on 29 Sep 2023
436 points (93.6% liked)

Technology


Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.

(page 2) 50 comments
[–] Gutless2615@ttrpg.network 3 points 1 year ago (1 children)

Everyone’s a fan of fair use until it’s their work that is transformed.

[–] RalphWolf@lemmy.ca 3 points 1 year ago (13 children)

Does this fall under the fair-use part of copyright law?

[–] kromem@lemmy.world 4 points 1 year ago

The training argument is probably going to come up dry by the time the court works its way through expert testimony, as the underlying argument for training as infringement is insane.

But where OpenAI is probably in hot water is that torrenting 100k books in the first place runs afoul of existing copyright legislation.

Everyone is debating the training in these suits, but the real meat and potatoes is going to be the initial infringement of obtaining the books, not how they were subsequently used.

[–] lloram239@feddit.de 4 points 1 year ago* (last edited 1 year ago)

Authors Guild, Inc. v. Google, Inc. decided that it is fair use to scan books and make large parts of them available verbatim on the net. What AI does is far more transformative than that: very little of a book can be reproduced verbatim with AI (e.g. popular quotes), and you really just get "knowledge" from the books. Unlike with Google, however, the sources are lost in the process, which in itself makes it difficult to argue for copyright violation, since you can't point at what was actually copied.

[–] originalucifer@moist.catsweat.com 2 points 1 year ago (1 children)

do they also complain when their books are used to train wet networks in public schools? those networks are also later exploited by corporations who don't give back to the writers. hmmmmmmm
