Stack Overflow mods go on strike (openletter.mousetail.nl)

submitted 1 year ago by salarua@sopuli.xyz to c/technology@beehaw.org


and as always, the culprit is ChatGPT. Stack Overflow Inc. won't let their mods take down AI-generated content

[–] kevin@beehaw.org 3 points 1 year ago* (last edited 1 year ago) (3 children)

I imagine it'll be possible in the near future to improve the accuracy of technical AI content fairly easily. It'd go something along these lines: have an LLM generate a candidate response, then have a second LLM validate that response. The validator would have access to real references it can use to ensure some form of correctness, e.g. a Python response could be plugged into a Python interpreter to make sure it, to some extent, does what it is purported to do. The validator then either decides the output is most likely correct or generates feedback asking the first LLM to revise, repeating until the response passes validation. This wouldn't catch 100% of errors, but a process like this could significantly reduce the frequency of hallucinations, for example.
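A minimal sketch of that generate/validate/revise loop, for illustration only. Everything named here is an assumption: `fake_llm` is a hypothetical stand-in for a real model call, `run_candidate` and `generate_until_valid` are made-up helper names, and the "validator" is just a fresh Python interpreter running a check snippet rather than a second LLM. It only verifies what the check covers, so it would not catch every error.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_candidate(code: str, check: str, timeout: float = 5.0) -> Optional[str]:
    """Run candidate code followed by a check snippet in a fresh interpreter.

    Returns None if the run succeeds, otherwise the error text to feed back.
    NOTE: a real setup would need proper sandboxing; this is only a sketch.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code + "\n" + check],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timed out"
    if proc.returncode == 0:
        return None
    return proc.stderr.strip() or "non-zero exit status"


def generate_until_valid(
    generate: Callable[[Optional[str]], str],
    check: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask the generator for code, validate it, and pass errors back as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(feedback)
        feedback = run_candidate(candidate, check)
        if feedback is None:
            return candidate  # passed the check
    return None  # gave up after max_rounds


def fake_llm(feedback: Optional[str]) -> str:
    """Hypothetical generator: buggy on the first try, fixed once it gets feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


result = generate_until_valid(fake_llm, "assert add(2, 3) == 5")
print(result)
```

Here the first candidate fails the assertion, the error text is handed back to the generator, and the second candidate passes. The check snippet plays the role of the "real reference", so the result is only as trustworthy as that check.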

[–] orclev@lemmy.ml 8 points 1 year ago

> The validator would have access to real references it can use to ensure some form of correctness

That's the crux of the problem: an LLM has no understanding of what it's saying, and it doesn't know how to use references. All it knows is that in similar contexts this set of words tended to follow this other set of words. It doesn't actually understand anything. It's capable of producing output that looks correct at a casual glance but is often wildly wrong.

Just look at that legal filing that idiot lawyer used ChatGPT to generate. It produced fake references that were trivial for a real lawyer to spot because they used the wrong citation format for the district they were supposedly from. They looked like real citations because they were based on how real citations looked but it didn't understand that citations have different styles depending on the court district and that the claimed district and citation style must match.

LLMs are very good at producing convincing-sounding bullshit, particularly for the uninformed.

I saw a post here the other day where someone was saying they thought LLMs were great for learning because beginners often don't know where to start. There might be some merit to that if it's used carefully, but by the same token it's incredibly dangerous, because it often takes very deep knowledge to see the various ways an LLM's output is wrong.

[–] Tutunkommon@beehaw.org 4 points 1 year ago

Best description I've heard is that an LLM is good at figuring out what the correct answer should look like, not necessarily what it is.

[–] FlowVoid@midwest.social 4 points 1 year ago* (last edited 1 year ago)

> The validator would have access to real references

And who wrote the "real" references?

Because that's the point of the post you replied to. LLMs can't completely replace humans, because only humans can make new "real references".