this post was submitted on 09 Aug 2023
173 points (94.4% liked)

Technology

34795 readers

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. Otherwise, such posts are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not make low-effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 5 years ago
all 37 comments
[–] theluddite@lemmy.ml 45 points 1 year ago (2 children)

The real problem with LLM coding, in my opinion, is something much more fundamental than whether it can code correctly or not. One of the biggest problems coding faces right now is code bloat. In my 15 years writing code, I write so much less code now than when I started, and spend so much more time bolting together existing libraries, dealing with CI/CD bullshit, and all the other hair that software projects have started to grow.

The amount of code is exploding. Nowadays, every website uses ReactJS. Every single tiny website loads god knows how many libraries. Just the other day, I forked and built an open source project that had a simple web front end (a list view, some forms -- basic shit), and after building it, npm informed me that it had over a dozen critical vulnerabilities, and dozens more of high severity. I think the total was something like 70?

Until now, all code had to be written at least once. With ChatGPT, it doesn't even need to be written once! We can generate arbitrary amounts of code all the time, whenever we want! We're going to have so much fucking code, and we have absolutely no idea how to deal with that.

[–] BloodyDeed@feddit.ch 12 points 1 year ago* (last edited 1 year ago) (1 children)

This is so true. I feel like my main job as a senior software engineer is to keep the bloat low and delete unused code. It's very easy to write code - maintaining it and focusing on the important bits is hard.

This will be one of the biggest and most challenging problems Computer Science will have to solve in the coming years and decades.

[–] floofloof@lemmy.ca 6 points 1 year ago* (last edited 1 year ago)

It's easy and fun to write new code, and it wins management's respect. The harder work of maintaining and improving large code bases and data goes mostly unappreciated.

[–] AlexWIWA@lemmy.ml 3 points 1 year ago

Makes the Adeptus Mechanicus look like a realistic future. Really advanced tech, but no one knows how it works.

[–] SirGolan@lemmy.sdf.org 19 points 1 year ago* (last edited 1 year ago) (3 children)

Wait a second here... I skimmed the paper and GitHub and didn't find an answer to a very important question: is this GPT-3.5 or 4? There's a huge difference in code quality between the two, and either they made a giant accidental omission or they are being intentionally misleading. Please correct me if I missed where they specified that. I'm assuming they were using GPT-3.5, so yeah, those results would be as expected. On the HumanEval benchmark, GPT-4 gets 67%, and that goes up to 90% with Reflexion prompting. GPT-3.5 gets 48.1%, which is exactly what this paper is saying. (source).
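
For context, HumanEval-style scoring is conceptually simple: run each model completion against the problem's unit tests and count the fraction of problems whose tests all pass. Here's a minimal sketch of that loop; the two problems and completions below are made-up stand-ins, not the actual benchmark data.

```python
# Toy HumanEval-style scorer: execute each candidate completion against its
# unit tests; the score is the fraction of problems that pass. The problems
# here are illustrative placeholders.

problems = [
    # (prompt, candidate completion, test code)
    ("def add(a, b):\n", "    return a + b\n", "assert add(2, 3) == 5"),
    ("def is_even(n):\n", "    return n % 2 == 0\n", "assert is_even(4) and not is_even(7)"),
]

def passes(prompt: str, completion: str, test: str) -> bool:
    env = {}
    try:
        exec(prompt + completion, env)  # define the candidate function
        exec(test, env)                 # run the unit tests against it
        return True
    except Exception:
        return False

score = sum(passes(*p) for p in problems) / len(problems)
print(f"pass rate: {score:.0%}")
```

The real benchmark sandboxes execution and uses 164 hand-written problems, but the pass/fail mechanic is the same shape.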

[–] Corkyskog@sh.itjust.works 3 points 1 year ago (2 children)

Is GPT4 publicly available?

[–] newIdentity@sh.itjust.works 3 points 1 year ago

Yes... If you pay $20 a month

[–] SirGolan@lemmy.sdf.org 3 points 1 year ago

Yes, it's available to anyone via the API or to anyone who pays for a ChatGPT subscription.

[–] floofloof@lemmy.ca 2 points 1 year ago (1 children)

Whatever GitHub Copilot uses (the version with the chat feature), I don't find its code answers to be particularly accurate. Do we know which version that product uses?

[–] SirGolan@lemmy.sdf.org 3 points 1 year ago

If we're talking about Copilot, then that's not ChatGPT. But I agree, it's OK. It can do simple things well, but I go to GPT-4 for the hard stuff. (Or my own brain, haha)

[–] yogthos@lemmy.ml -3 points 1 year ago

Oh that's possible, not sure which one they used either.

[–] s20@lemmy.ml 19 points 1 year ago

If I'm going to use AI for something, I want it to be right more often than I am, not just as often!

[–] r00ty@kbin.life 15 points 1 year ago (3 children)

I used ChatGPT once. It created non-functional code. But the general idea did help me get to where I wanted. Maybe it works better as a rubber duck substitute?

[–] GBU_28@lemm.ee 2 points 1 year ago

Use it as a boilerplate blaster, for shit you could write yourself.

[–] dom@lemmy.ca 1 points 1 year ago* (last edited 1 year ago)

I did my first game jam with the help of ChatGPT. It didn't write any code in the game, but I was able to ask it how to accomplish certain things generally, and it would give me ideas that were up to me to implement.

There were other things I knew my engine could do but couldn't figure out from the documentation, so I would ask ChatGPT "how do you xyz in Godot" and it would give me step-by-step instructions. This was especially useful for the things that get done in the engine UI and not in code.

[–] yogthos@lemmy.ml 1 points 1 year ago (1 children)

Yeah, generating some ideas to get you going might be the best use for this kind of stuff.

[–] WarmSoda@lemm.ee 6 points 1 year ago* (last edited 1 year ago) (2 children)

That's how I view AI-generated art. It can come up with some really cool mash-ups, but you have to do the rest. Anyone just using what it outputs as if that's the end of the story isn't "using it right", in my opinion.

[–] EssentialCoffee@midwest.social 1 points 1 year ago (1 children)

I'm not sure there's a way to 'use art right.'

[–] WarmSoda@lemm.ee 2 points 1 year ago* (last edited 1 year ago) (1 children)

You're obviously not an artist. And you managed to completely miss my point.

[–] EssentialCoffee@midwest.social 0 points 1 year ago (1 children)

No, but my husband is, and he's been refining keywords and using all sorts of LoRAs and other jargon that I don't recall because I'm not interested in doing it myself.

And I didn't miss your point, I just don't agree with it.

[–] WarmSoda@lemm.ee 0 points 1 year ago* (last edited 1 year ago) (1 children)

So what, are you the art version of a military wife? Just throwing "aCtuALly"s out into the void because your husband types words into a field?

[–] EssentialCoffee@midwest.social 1 points 1 year ago (1 children)

Better than being the art version of an asshole, but I see you're already filling that role.

[–] WarmSoda@lemm.ee 1 points 1 year ago* (last edited 1 year ago)

Ok Karen. Go dependa somewhere.

[–] yogthos@lemmy.ml -1 points 1 year ago

Right, I expect stuff like Stable Diffusion will become part of the toolkit actual artists use. The workflows with this stuff are already getting pretty intricate, where people use ControlNet for posing, inpainting for specific details, and so on. I would liken it to photography: you can't just give a camera to anybody and get good results; it takes a person with skill and taste to produce an interesting image.

[–] Fluffles@pawb.social 12 points 1 year ago (1 children)

I believe this phenomenon is called "hallucination". It's when a language model exceeds its training and makes up info out of thin air. All language models have this flaw, not just ChatGPT.

[–] yogthos@lemmy.ml 8 points 1 year ago (1 children)

The fundamental problem is that, at the end of the day, it's just a glorified Markov chain. An LLM doesn't have any actual understanding of what it produces in the human sense; it just knows that particular sets of tokens tend to go together in the data it's been trained on. The GPT mechanism could very well be a useful building block for making learning systems, but a lot more work will need to be done before they can actually be said to understand anything in a meaningful way.

I suspect that to make a real AI we have to embody it in either a robot or a virtual avatar, where it would learn to interact with its environment the way a child does. The AI has to build an internal representation of the physical world and its rules. Then we can teach it language using this common context, where it would associate words with its understanding of the world. This kind of shared context is essential for having AI understand things the way we do.

[–] v_krishna@lemmy.ml 4 points 1 year ago (1 children)

A lot of semantic NLP tried this, and it kind of worked, but meanwhile statistical correlation won out. It turns out that while humans consider semantic understanding to be really important, it actually isn't required for an overwhelming majority of industry use cases. As a Kantian at heart (and an ML engineer by trade), it sucks to recognize this, but it seems like semantic conceptualization as an epiphenomenon emerging from statistical concurrence really might be the way that (at least artificial) intelligence works.

[–] yogthos@lemmy.ml 0 points 1 year ago (1 children)

I don't see the approaches as mutually exclusive. Statistical correlation can get you pretty far, but we're already seeing a lot of limitations with this approach when it comes to verifying correctness or having the algorithm explain how it came to a particular conclusion. In my view, this makes a purely statistical approach inadequate for any situation where a specific result is required. For example, an autonomous vehicle has to drive on a road and correctly decide whether there are obstacles around it or not. Failing at that has disastrous consequences, which makes purely statistical approaches inherently unsafe.

I think things like GPT could be building blocks for systems that are trained to have semantic understanding. I think what it comes down to is simply training a statistical model against a physical environment until it adjusts its internal topology to create an internal model of the environment through experience. I don't expect that semantic conceptualization will simply appear out of feeding a bunch of random data into a GPT style system though.

[–] v_krishna@lemmy.ml 2 points 1 year ago (1 children)

I fully agree with this; I would have written something similar, but I was eating lunch when I made my former comment. I also think there's a big part of pragmatics that comes from embodiment that will become more and more important (and I wish Merleau-Ponty were still around to hear what he thinks about this).

[–] yogthos@lemmy.ml -3 points 1 year ago (1 children)

Indeed, I definitely expect interesting things to start developing on that front, and we may see old ideas getting dusted off because now there's enough computing power to put them to use. For example, I thought The Society of Mind by Minsky laid out a plausible architecture for a mind. Imagine each agent in that scenario being a GPT system, and the bigger mind being built out of a society of such agents, each concerned with a particular domain it learns about.

[–] v_krishna@lemmy.ml 1 points 1 year ago (1 children)

Many (14?) years back I attended a conference (I can't remember now what it was for; I think a complex systems department at some DC-area university) and saw a woman give a talk about using agent-based modeling to do computational sociology planning around federal (mostly Navy/Army) development in Hawaii. Essentially a SimCity type of thing, but purpose-built to help aid public planning decisions. Now imagine that, but the agents aren't just sets of weighted heuristics; instead they're weighted, prompt-driven LLMs, with higher-level executive prompts to bring them together.
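
The heuristic version of that planning model can be sketched in a few lines: each agent is a policy function scoring a proposed action from its own narrow concern, and an "executive" combines the weighted scores into a decision. Swapping each function for a role-prompted LLM call is exactly the upgrade described above. All the agent names, weights, and thresholds here are made up for illustration.

```python
# Hypothetical agent-based planning sketch: each agent scores a proposed
# development from its own concern; the executive aggregates weighted votes.

def housing_agent(obs: dict) -> float:
    # Favors building when demand outstrips supply.
    return 1.0 if obs["housing_demand"] > obs["housing_supply"] else -1.0

def traffic_agent(obs: dict) -> float:
    # Opposes building when congestion is already high.
    return -1.0 if obs["congestion"] > 0.7 else 0.5

def executive(agents: list, weights: list, obs: dict) -> str:
    score = sum(w * a(obs) for a, w in zip(agents, weights))
    return "build" if score > 0 else "hold"

obs = {"housing_demand": 120, "housing_supply": 90, "congestion": 0.4}
decision = executive([housing_agent, traffic_agent], [0.6, 0.4], obs)
print(decision)
```

In the LLM variant, each policy function would instead prompt a model with its role and the observation, and the executive prompt would reconcile the agents' free-text proposals rather than numeric scores.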

[–] yogthos@lemmy.ml -3 points 1 year ago* (last edited 1 year ago)

I'm really excited to see this kind of stuff experimented with. I find it really useful to think of machine learning agent training as creating a topology, through balancing of weights and connections, that ends up being a model of the particular domain described by the data it's being fed. The agent learns patterns in the data it observes and creates an internal predictive model based on that. Currently, most machine learning systems seem to focus on either individual agents or small groups, such as adding a supervisor. It would be interesting to see large graphs of such agents that interact in complex ways, where high-level agents only interact with other agents and don't even need to see any of the external inputs directly. One example would be to have one system trained on visual input and another on audio, and then a high-level system responsible for integrating these inputs and doing the actual decision-making.
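
The vision/audio example can be sketched as a tiny hierarchy: two "sensory" modules each summarize their own modality, and the high-level module sees only their summaries, never the raw inputs. The modules below are trivial threshold stand-ins for trained models, and every name and number is illustrative.

```python
# Toy sketch of the hierarchy described above. The top-level decision
# module never touches raw pixels or audio samples, only the summaries
# produced by the lower-level modules.

def vision_module(pixels: list) -> dict:
    # Stand-in "detector": flags an obstacle if any pixel is bright enough.
    return {"obstacle": max(pixels) > 0.8}

def audio_module(samples: list) -> dict:
    # Stand-in: flags a siren if total signal energy crosses a threshold.
    return {"siren": sum(abs(s) for s in samples) > 10}

def decision_module(vision: dict, audio: dict) -> str:
    # High-level agent: integrates the summaries and picks an action.
    if vision["obstacle"] or audio["siren"]:
        return "brake"
    return "proceed"

action = decision_module(vision_module([0.1, 0.9, 0.2]),
                         audio_module([0.5, -0.5, 1.0]))
print(action)
```

The point of the structure is that each lower module can be trained on its own modality independently, while the top module only ever learns over the interface they expose.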

And I just ran across this: https://arxiv.org/abs/2308.00352

[–] Pleonasm@programming.dev 8 points 1 year ago (1 children)

I was pretty impressed with it the other day: it converted ~150 lines of Python to C pretty flawlessly. I then asked it to extend the program by adding a progress bar, and that segfaulted, but it was immediately able to discover the segfault and fix it when I mentioned it. It would probably have taken me an hour or two to write myself, and ChatGPT did it in 5 minutes.

[–] Afghaniscran@feddit.uk 8 points 1 year ago

I used it to code small things and it worked eventually, whereas if I'd just decided to learn coding I'd be stuck, cos I don't do computers, I do HVAC.

[–] humanplayer2@lemmy.ml 1 points 1 year ago

Condorcet sobs "so close".