this post was submitted on 10 Jul 2023
418 points (94.7% liked)

Technology

[–] Marxine@lemmy.ml 5 points 1 year ago (1 children)

VC-backed AI makers and billionaire-run corporations should definitely pay for the data they use to train their models. The common user should definitely check the licences of the data they use as well.

[–] SixTrickyBiscuits@lemmy.world -4 points 1 year ago (2 children)

That is essentially impossible. How are they going to pay each Reddit user whose comment the AI analyzed? Or each website it analyzed? We're talking about terabytes of text data taken from a huge variety of sources.

[–] CannaVet@lemmy.world 6 points 1 year ago* (last edited 1 year ago) (1 children)

Then it should be treated as what it is: an illegal venture based on theft. I don't get a legal pass to steal just because the groceries I stole got cooked into a meal and are therefore no longer the groceries I stole.

[–] azuth@lemmy.world -3 points 1 year ago

Firstly, copyright infringement is not theft. It's not theft because the grocer still has the groceries. It is a lesser crime, which obviously hurts the victim less, if at all in some cases.

A summary is also not copyright infringement; it's fair use. Of course copyright holders would love to copyright strike bad reviews (they already do, even though such reviews are not illegal).

[–] Marxine@lemmy.ml 3 points 1 year ago

Billionaires can spend and burn their whole net worth for all I care. Datasets should be either:

  • Paid for, with the payment going to the provider platform and each original content creator getting a share (e.g. the platform keeps 10% of the sale price for hosting costs, and the remaining 90% is distributed to content creators according to the size and quality of the data they provided)
  • Consciously donated by the content creators (e.g. an OPT-IN term on the platform about donating agreed-upon data for non-profit research), in which case the dataset must never be sold or used for profit. Publicly available research purposes only.
  • "Rented" out by the users and platform in an OPT-IN manner, with royalties/payments going to them for each purchase/usage of the dataset.
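To make the first option concrete, here is a rough sketch of how such a split could be computed. Everything in it is an assumption for illustration: the `Contribution` fields, the `split_revenue` name, and especially the choice to weight each creator by `size * quality`, since "size and quality" could be combined in many other ways.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    creator: str
    size: int       # assumed unit, e.g. number of words contributed
    quality: float  # hypothetical quality score between 0 and 1


def split_revenue(sale_price, contributions, platform_share=0.10):
    """Return (platform_cut, payouts) for one sale of the dataset.

    The platform keeps `platform_share` of the sale price; the rest is
    shared among creators in proportion to size * quality (an assumed
    weighting, not the only possible one).
    """
    platform_cut = sale_price * platform_share
    pool = sale_price - platform_cut

    # Sum the weight of every contribution per creator.
    weights = {}
    for c in contributions:
        weights[c.creator] = weights.get(c.creator, 0.0) + c.size * c.quality

    total = sum(weights.values())
    if total == 0:
        # Nothing to weight by, so no creator payout can be computed.
        return platform_cut, {name: 0.0 for name in weights}

    payouts = {name: pool * w / total for name, w in weights.items()}
    return platform_cut, payouts


if __name__ == "__main__":
    cut, payouts = split_revenue(
        1000.0,
        [Contribution("alice", 100, 1.0), Contribution("bob", 300, 1.0)],
    )
    print(cut, payouts)
```

With a sale price of 1000 and two creators of equal quality contributing 100 and 300 units, the platform keeps 100 and the remaining 900 is split 225 to 675.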

The way things are currently done only favours venture capitalists (wage thieves), shareholders (also wage thieves) and billionaire C-suite executives (wage thieves as well).