In the end, it's important for those who make lying machines to be held accountable for the falsehoods they spread. Not to mention the copyright infringement. If I write code licensed under the GPL and it gets used for training, then all derivative code is copyleft. Am I wrong there? I know Microsoft trains GitHub Copilot on public code; does that mean they're respecting the licenses of the code they're using for training? I have so many questions.
I haven't actually used ChatGPT, but I've used the "new Bing" AI. With that, whenever it states something as fact, it attaches a citation to the website it pulled the claim from, so you can verify things. Does ChatGPT do that as well?
No. ChatGPT does not source its claims. I think doing so would also change its (usually misunderstood) purpose. ChatGPT is a language model made to create responses that appear natural. Its purpose was not to recite facts, although it often does so as a side effect of how it was trained; it simply creates likely word combinations. The researchers fed in millions of texts and ChatGPT ran some math to figure out which word is most likely to come next after each other word. So, simplified, it operates on a likelihood table of word relationships, generated when it was trained. This includes following up "Super Mario is" with "a video game character", as most texts it saw refer to him as that.
People mistake this for generating facts (because when asked about things, a factual response is likely, since that is what ChatGPT usually saw), but this was never the purpose of ChatGPT. So a response like "Super Mario is an orange cat who loves lasagne" would also be valid output, as it perfectly resembles natural language. It's factually wrong but a correct sentence, and after the first switch from "video game character" to cat, following up "orange cat" with a love for lasagne is again a likely continuation. This is also what happens in most "made up" sentences. ChatGPT takes a wrong turn somewhere (maybe because it does not have facts on a person or thing, maybe because it's ambiguous, or maybe because the user was actually trying to steer it into that sentence) but then continues to follow up with likely words.
And once you realise that ChatGPT always tries to create a natural, logical response, you can easily trick it into making things up. So if you ask about lawsuits regarding a certain person, it'll create a natural sentence describing a lawsuit that this person was involved in that is entirely made up. But most importantly: the text will be grammatically correct and appear natural.
And if ChatGPT responds "This person was never involved in a lawsuit", you can often simply say "OK, but let's pretend this person was" and ChatGPT will happily make something up.
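To make the "likelihood table" idea above a bit more concrete, here is a minimal sketch in Python. It is a toy bigram model, not how ChatGPT is actually built (the real model is a large neural network working on sub-word tokens with far more context). The three-sentence corpus and the function names `train_bigram_table` and `continue_text` are made up purely for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_table(texts):
    """Count, for every word, which words follow it in the training texts."""
    table = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        for current, following in zip(words, words[1:]):
            table[current][following] += 1
    return table

def continue_text(table, prompt, max_words=8):
    """Greedily append the most frequent follower of the last word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        followers = table.get(words[-1])
        if not followers:
            break  # the last word was never followed by anything in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Tiny invented corpus: "is a" appears more often than "is an".
corpus = [
    "mario is a video game character",
    "super mario is a video game character",
    "garfield is an orange cat who loves lasagne",
]
table = train_bigram_table(corpus)

# The likely path.
print(continue_text(table, "super mario is"))
# One "wrong turn" (the word "an") and every later word is still likely.
print(continue_text(table, "super mario is an"))
```

The second call shows the point from the comment: nothing in the table knows what is true, so once the text has been nudged onto "an", the most likely follow-up words produce a fluent but false sentence about Mario.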
Bing's purpose is different. It of course also has that "natural language" approach, but additionally it might actually run a web search to be able to quote and source its claims, whereas ChatGPT does not even know what's written on today's websites. It has access to its initial training data from September 2021 and a few additional datasets they have used since then for refinement, but it has no live access to the Internet to look up current information. So Bing will likely be able to summarise the latest Apple announcements, where ChatGPT will just say "sorry, I do not have that information". If pushed, ChatGPT might make up correct natural language sentences about that conference, but the statements will just be likely word combinations, not facts.
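The difference described above (search first, then answer with a source, versus answering only from what was seen in training) can be sketched roughly like this. Everything here is a hypothetical stand-in: `WEB_PAGES` plays the role of a live web index, the URLs and page texts are placeholders, and the word-overlap `search` is far cruder than anything Bing really does.

```python
# Hypothetical stand-in for a live web index: URL -> page text (placeholders).
WEB_PAGES = {
    "https://example.com/apple-news": "placeholder summary of the latest apple announcements",
    "https://example.com/mario": "super mario is a video game character",
}

NO_INFO = "Sorry, I do not have that information."

def search(query, min_overlap=2):
    """Return (url, text) of the page sharing the most words with the query."""
    query_words = set(query.lower().split())
    best, best_overlap = None, 0
    for url, text in WEB_PAGES.items():
        overlap = len(query_words & set(text.split()))
        if overlap > best_overlap:
            best, best_overlap = (url, text), overlap
    # Too little overlap counts as "nothing found".
    return best if best_overlap >= min_overlap else None

def answer_with_search(question):
    """Search-backed answer: quote the page found and cite where it came from."""
    hit = search(question)
    if hit is None:
        return NO_INFO
    url, text = hit
    return f"{text} [source: {url}]"

def answer_without_search(question):
    """No live lookup: anything newer than the training data is unknown."""
    return NO_INFO

question = "what were the latest apple announcements"
print(answer_with_search(question))
print(answer_without_search(question))
```

The citation is only possible because the answer is built from a document that was actually retrieved; a model without that step has nothing to point to.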
Interesting, I had assumed it was also doing some kind of live query when asked. Given that, I'm surprised ChatGPT gets so much more conversation and popularity than "new Bing", which I'm sure is a huge irritation for Microsoft. I know it has a crappy name, and "Bing" doesn't really have a sexy connotation anyways, but it seems like it provides a lot more of what people actually want from ChatGPT (although my understanding is new Bing is based on ChatGPT v3, so in some ways it is ChatGPT).
Isn't there basically a disclaimer at the end of every response you get from ChatGPT? Or am I just asking sketchy questions?
@Winterismyseason @Powderhorn in a lot of cases disclaimers are not legally binding and cannot absolve ChatGPT from legal troubles. Heck, even the majority of EULAs are legal fluff.
That was my thinking. OpenAI may feel their disclaimers are airtight, but courts get the final say on such things. This sort of thing was bound to happen, and I'm glad no one suffered physical injury or death for the first lawsuit to be filed.
@Powderhorn a lot of tech billionaires live in a wishful-thinking world where they disrupt and void real-world (including legal) consequences. They usually come crashing down. It is just surprising that big swaths of the media absolutely never learn from this.
When the future doesn't exist past the next round of earnings, there's no upside to learning.
@Powderhorn yeah, in a nutshell there is a lot of self-justification happening there.
Not that they are unique in this. Overall, humanity is still learning how to deal with its own quirks.
Unrelated, is it considered proper etiquette to @ users one is replying to every time? I have no issue doing so; I just didn't see a need. But it's been pointed out that comment threads aren't rendered the same for all users, so I can certainly start doing so.
@Powderhorn as far as I have seen, the Fediverse somehow tracks threads without mentions. So for me it is up to each person's preference :) All the Mastodon clients I use prefix messages with mentions of the people I reply to, but they don't have to.
Thanks for the explanation!
I think any lawyer using a tool like this and not pulling the actual relevant case law is simply a moron. I’m not a lawyer, but I am a licensed engineer, and if I used ChatGPT to help kickstart any design thing, I’d take that result with an entire block of salt, and double check everything I was using, because ultimately I am responsible for the design. Same here with the lawyers.