this post was submitted on 02 Mar 2024
Technology
You need to provide it the data. The fact that they know things at all from pretraining was kind of a surprise to everyone in the industry. Their current use case as a Google replacement is really not ideally aligned with their capabilities. But the models have turned out to be surprisingly good at in-context learning, and context windows keep growing, so depending on the model you can absolutely provide relevant reference material to ground the responses in a factual reference point before asking for deeper analysis. It's hard to give specific recommendations without knowing more about what you are trying to accomplish, but "they're very stupid" runs extremely counter to most of what I've seen at this point, and in the rare cases where that does seem true, there's usually something more nuanced getting in the way, and a slight modification to what or how I'm asking gets past it.
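To make "provide it the data" concrete, here's a rough sketch of putting reference material into the prompt ahead of the question. The function name, the `[Source N]` labels, and the instruction wording are all made up for illustration; nothing here calls a real model API, it only builds the text you would paste or send.

```python
def build_grounded_prompt(question: str, references: list[str]) -> str:
    """Place reference material ahead of the question so the model can answer from it.

    Hypothetical helper: the layout and wording are assumptions, not a standard.
    """
    # Number each reference so the model (and you) can point back to it.
    numbered = "\n\n".join(
        f"[Source {i}]\n{text.strip()}"
        for i, text in enumerate(references, start=1)
    )
    return (
        "Use only the sources below to answer. "
        "If they don't cover the question, say so.\n\n"
        f"{numbered}\n\n"
        f"Question: {question.strip()}"
    )


if __name__ == "__main__":
    print(
        build_grounded_prompt(
            "What is the refund window?",
            ["Refunds are accepted within 30 days.", "Shipping takes 5 days."],
        )
    )
```

How much material fits depends on the context window of whichever model you're using, so for long documents you'd likely need to pick the relevant excerpts first.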
Really? I find that the chat models are almost overtuned toward asking for more details as part of their reengagement strategy. In fact, a number of the employment-related usage examples I've seen were things like users having the model ask a series of questions about work history and responsibilities in order to summarize resume fodder. So again, maybe a bit of a difference between users of the tools.
My use of the models is almost entirely related to complex scenarios, and while I'd agree that something like GPT-3 is dumb as shit, GPT-4 is probably among the smarter interactions I've had in my life, and I used to consult for C-suite execs of Fortune 500s. One of my favorite results came from explaining the factors I suspected were causing it to get a question wrong; it then generated a correct workaround that was quite brilliant (the issue was token similarity to a standard form of a question, and the proposed solution was replacing the nouns with emojis, which did bypass the similarity bias and allowed it to answer correctly when it was failing before). In spite of there being no self-introspection capabilities, giving it background details resulted in novel and ultimately correct out-of-the-box solutions.
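For anyone who wants to try the emoji trick, a minimal sketch of the substitution step is below. The function name, the example question, and the noun-to-emoji mapping are hypothetical; the comment above doesn't say which question or which emojis were used, and you have to pick the nouns yourself.

```python
import re


def swap_nouns_for_emojis(question: str, mapping: dict[str, str]) -> str:
    """Replace whole-word occurrences of the given nouns with emojis.

    Hypothetical helper: matching is case-insensitive and exact, so plurals
    or other inflections need their own entries in the mapping.
    """
    if not mapping:
        return question
    # Try longer nouns first so a short key can't shadow a longer one.
    nouns = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in nouns) + r")\b",
        re.IGNORECASE,
    )
    lookup = {noun.lower(): emoji for noun, emoji in mapping.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], question)


if __name__ == "__main__":
    # Made-up example question and mapping, purely for illustration.
    print(
        swap_nouns_for_emojis(
            "The farmer must take the wolf, the goat, and the cabbage across.",
            {"wolf": "🐺", "goat": "🐐", "cabbage": "🥬"},
        )
    )
```

The idea is just to keep the structure of the question intact while changing the tokens that make it look like the well-known version.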
From the sound of it, you are trying to use it for coding. I recommend switching to one of the models that specializes in that rather than using a generalist model.
And on the off chance you are using the free 3.5 version - well, stop that. That one sucks, and it's like using an Atari when there's a PS3 available instead. Don't make the mistake of extrapolating where the tech is at based on outdated tech being provided for free as a foot in the door.