this post was submitted on 03 May 2024
57 points (90.1% liked)

Ask Lemmy

[–] drmoose@lemmy.world 31 points 6 months ago* (last edited 6 months ago) (2 children)

No, it's not better. It's extremely invasive, as you have to fingerprint users and store their fingerprints on your servers indefinitely. Not only that, but all of this can be avoided by anyone with half a brain cell. Lemmy should not waste its resources on something like this. It's extremely hard to do, to the point where literally nobody has a good system, even giants like LinkedIn. Source: I work in bot detection.

Lemmy would never get this right no matter how many people contributed, and it would just cause overall harm to the platform through privacy invasion and false positives.
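
To illustrate the "can be avoided" point above, here is a minimal sketch of a naive exact-match fingerprint. The signal names (`user_agent`, `screen`, `timezone`) and the hashing scheme are assumptions for illustration only, not how Lemmy or any real bot-detection product works; production systems typically use fuzzier matching.

```python
import hashlib


def fingerprint(signals: dict[str, str]) -> str:
    """Hash a set of client signals into one identifier (hypothetical scheme)."""
    # Sort keys so the same signals always produce the same hash.
    canonical = "|".join(f"{key}={signals[key]}" for key in sorted(signals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical signals recorded for a banned user.
banned = {"user_agent": "Firefox/125", "screen": "1920x1080", "timezone": "UTC+2"}

# The same person returning with a different browser.
evader = {**banned, "user_agent": "Chrome/124"}

# Changing a single signal yields an unrelated hash, so an exact match misses it.
same_person_detected = fingerprint(banned) == fingerprint(evader)
```

Loosening the match to catch the evader is what introduces false positives, which is the trade-off described above.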

[–] LWD@lemm.ee 3 points 6 months ago (2 children)

Lemmy has quite a few unfortunately invasive qualities of its own, including generally needing an email address from you (Reddit does not), having poor privacy and data retention practices, and generally being very messy with who gets to decide what happens with your data and how easily it can be scraped.

Sure, Reddit sells it... But Lemmy gives it to any web scraper for free.

[–] drmoose@lemmy.world 11 points 6 months ago (1 children)

But Lemmy gives it to any web scraper for free

Which is good. You either have an open system or a closed one. There's no in-between.

If you want the advantages of a free, public, decentralized network, you can't obfuscate and centralize bits and pieces of it. Also, it's 2024; we need to stop this misinformation that an email address is supposed to be private. What is private is the association between an email address and its owner, and Lemmy doesn't leak or infringe on that. The address is literally called an address because it's supposed to be public.

[–] LWD@lemm.ee 2 points 6 months ago* (last edited 6 months ago) (2 children)

...And attitudes like this towards privacy will keep Lemmy from progressing to a point where those issues will be fixed.

I have a fundamental problem with giant corporations scraping user data without user consent. That's a system-level issue. It doesn't become "good" just because they get to scrape without consent for free.

[–] drmoose@lemmy.world 2 points 6 months ago

Nah, it has nothing to do with attitude but with practicality. This would mean people's fingerprints need to be public and shared between servers, or some other hack. It's just not possible to do safely, and it's not really a hill worth dying on. Do we really care about users dodging subreddit bans that much? It's silly.

[–] can@sh.itjust.works 1 points 6 months ago (1 children)

What would a "fix" look like in your eyes? Do you have an implementation in mind?

[–] LWD@lemm.ee 1 points 6 months ago (1 children)

I have a few suggestions for development concerns off the top of my head:

  • Scrub post metadata* after users request its deletion
  • Auto-purge deleted content* rather than letting it sit behind a "deleted" flag (something Facebook got a ton of flak for doing)
  • Auto-purge deleted media*
  • Seriously consider limiting how widely data is opened up for scraping, since the problem is non-consensual scraping, not payment for non-consensual scraping

* either immediately or, to prevent spam, after some time
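
A rough sketch of the "auto-purge after some time" idea from the list above. The `Post` type, the `deleted_at` field, and the 30-day `GRACE_PERIOD` are all assumptions made for illustration; none of them come from Lemmy's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical grace period; the suggestion only says "after some time".
GRACE_PERIOD = timedelta(days=30)


@dataclass
class Post:
    id: int
    body: str
    deleted_at: Optional[datetime] = None  # None means the post is still live


def purge_expired(posts: list[Post], now: datetime) -> list[Post]:
    """Return only the posts that should remain in storage.

    Live posts are kept. Soft-deleted posts are kept only while they are
    still inside the grace period, then dropped for good.
    """
    return [
        post
        for post in posts
        if post.deleted_at is None or now - post.deleted_at < GRACE_PERIOD
    ]


now = datetime(2024, 5, 3, tzinfo=timezone.utc)
posts = [
    Post(1, "still live"),
    Post(2, "deleted yesterday", now - timedelta(days=1)),
    Post(3, "deleted long ago", now - timedelta(days=31)),
]
remaining_ids = [post.id for post in purge_expired(posts, now)]
```

Setting the grace period to zero gives the "immediately" variant; a non-zero one leaves moderators a window to review spam before it disappears.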

[–] can@sh.itjust.works 1 points 6 months ago (1 children)

I agree with your first few points, but I'm unsure about the scraping. This is a public forum. What could be done to mitigate scraping that wouldn't take away from that?

[–] LWD@lemm.ee 1 points 6 months ago

If we take "unlimited unauthenticated API access shouldn't be possible" for granted, I'm unfortunately not all that technically competent about what can be done next.

The first thing that comes to mind is treating website access and app access differently, maybe limiting app API access by default for people who haven't logged in.
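
One way the "limit access for people who haven't logged in" idea might look is a fixed-window rate limiter with a lower ceiling for anonymous clients. This is only a sketch: the limits, the window size, and the `RateLimiter` class are hypothetical and are not taken from Lemmy's code or configuration.

```python
from collections import defaultdict

# Hypothetical limits, in requests per window.
ANON_LIMIT = 10
AUTH_LIMIT = 100
WINDOW_SECONDS = 60


class RateLimiter:
    """Fixed-window limiter keyed by client (e.g. an IP address or user id)."""

    def __init__(self) -> None:
        # (client key, window index) -> requests seen in that window.
        # Old windows are never evicted in this sketch.
        self._counts: dict[tuple[str, int], int] = defaultdict(int)

    def allow(self, client_key: str, authenticated: bool, now: float) -> bool:
        """Return True if the request fits in the client's current window."""
        limit = AUTH_LIMIT if authenticated else ANON_LIMIT
        window = int(now // WINDOW_SECONDS)
        bucket = (client_key, window)
        if self._counts[bucket] >= limit:
            return False
        self._counts[bucket] += 1
        return True
```

A scraper can still rotate IP addresses to get around this, so it raises the cost of bulk scraping rather than preventing it.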

Or creating a separate bot API that's rolled out across all servers at some point in the future... And I know federation could pose some serious chokepoints here so that's where my speculation ends.

[–] andrew_bidlaw@sh.itjust.works 8 points 6 months ago

Instances can enable or disable the email verification and other measures, like asking why you want to join that instance.

I don't recall Reddit being so liberal. I haven't used an account I didn't verify the same day, so I can't say if it works, but I suspect they can enable different protocols for inspecting unverified accounts.

As a side note to that discussion: my VPN works with most services I can't access otherwise, while Reddit blocked me when I tried to access it to see for myself. I'm surprised.

[–] PhlubbaDubba@lemm.ee 0 points 6 months ago (1 children)

Keeping a list of "fingerprints" of users is hardly invasive, and it's only dangerous without proper database security.

It can throw up false positives, but the key there is to make it as good at not doing that as possible, and having a reasonable means for users who feel like they were unfairly tagged as evaders to appeal the flag.

Also, don't do it automatically. Use it as a tool to identify possible cases and have a review team check which ones need the most immediate action, with help from a separate algorithm that prioritizes user reports by how reliably a user's reports have pinged actionable content.
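
As a sketch of that prioritization step, the review queue could be sorted by each reporter's track record. The `Reporter` type, its fields, and the smoothing formula are assumptions chosen for illustration, not an existing Lemmy feature.

```python
from dataclasses import dataclass


@dataclass
class Reporter:
    name: str
    actionable: int  # past reports that led to moderator action
    total: int       # all past reports


def reliability(reporter: Reporter) -> float:
    """Share of a reporter's past reports that were actionable.

    Laplace smoothing (+1 / +2) makes a brand-new reporter start at 0.5
    instead of 0 or 1.
    """
    return (reporter.actionable + 1) / (reporter.total + 2)


def prioritize(reports: list[tuple[str, Reporter]]) -> list[str]:
    """Order reported item ids for human review, most reliable reporter first."""
    ranked = sorted(reports, key=lambda item: reliability(item[1]), reverse=True)
    return [item_id for item_id, _ in ranked]
```

The output is only an ordering for the review team; nothing here takes action on an account by itself.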

That's the entire game of security: not being perfect, but being good enough that the adversary decides you might as well be perfect for all that their efforts would be worth. Ban evasion protection and bot prevention are no different.

[–] drmoose@lemmy.world 6 points 6 months ago (1 children)

That's the entire game of security, not being perfect, but being good enough

Yes, and good enough is so hard to reach that it is in no way achievable with Lemmy's volunteer resources. We literally have full-time people and massive AI-driven systems doing this professionally. This is in no way achievable for Lemmy if centralized Reddit, with its multi-million-dollar budgets, can't even get close to "good enough".

[–] PhlubbaDubba@lemm.ee 3 points 6 months ago

TBF Reddit isn't exactly trying all that hard, since ban evaders tend to be good for engagement metrics. Like half the measures they do employ, they only employ because they feel they have to in order to not look like they just blatantly don't give a shit so long as the investor-watched metrics keep going up.