this post was submitted on 14 Apr 2024
270 points (91.7% liked)

Futurology

[–] CanadaPlus@lemmy.sdf.org 0 points 7 months ago (1 children)

It's not a search algorithm. If it is, that's an overfitted model, and it's detected and rejected. What a good foundation model is doing is just about as mysterious as the brain.

[–] conciselyverbose@sh.itjust.works 1 points 7 months ago (1 children)

It's fundamentally extremely comparable mathematically and algorithmically. That's the point. Simulated annealing doesn't need to understand the search space to find a pretty good answer to a problem. It just needs to know what a good answer approximately looks like and nudge potential answers closer that way.
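For anyone who hasn't seen it, the "nudge potential answers toward what a good answer looks like" idea can be sketched in a few lines of Python. This is only a minimal illustration, not anyone's production code: the `bumpy` objective, the step size, the cooling rate, and the starting point are all made-up assumptions.

```python
import math
import random

def simulated_annealing(score, start, steps=5000, temp=1.0, cooling=0.999, seed=0):
    """Minimise `score` starting from `start` (lower score = better answer)."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        candidate = current + rng.uniform(-0.5, 0.5)  # small random nudge
        delta = score(candidate) - score(current)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        # Remember the best answer seen so far.
        if score(current) < score(best):
            best = current
        temp *= cooling
    return best

def bumpy(x):
    # Hypothetical objective with several local minima (an assumption for the demo).
    return (x - 2.0) ** 2 + math.sin(5.0 * x)

result = simulated_annealing(bumpy, start=-4.0)
```

Note that the loop only ever calls `score`; it never needs a description of the search space itself, which is the point being made above.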

What LLMs are doing is not mysterious at all. Why a specific parameter in a model ends up with the value it has is a mystery, but there's no mystery to the algorithm. We can't even guess at most of the algorithms that make up the brain.

[–] CanadaPlus@lemmy.sdf.org 1 points 7 months ago (1 children)

Simulated annealing is a search algorithm which finds a solution.

Backpropagation is a search algorithm which finds a function, and in a big enough network that function could in principle be pretty much any computable one. Once the network is trained and rolled out to consumers, backpropagation isn't used at all.
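A rough sketch of that split between training and use, with a big caveat: plain gradient descent on a two-parameter linear model stands in here for backpropagation through a real network, and the data, learning rate, and the names `train` and `make_model` are all assumptions made up for the example.

```python
def train(data, steps=2000, lr=0.05):
    """Search parameter space for (w, b) that minimise mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def make_model(w, b):
    """The trained artifact: a plain function with no search loop inside."""
    return lambda x: w * x + b

# Toy data generated from y = 3x + 1 (an assumption for the demo).
data = [(x, 3.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b = train(data)          # the search happens here, once
model = make_model(w, b)    # what gets shipped
prediction = model(10.0)    # using it involves no gradients at all
```

The search lives entirely in `train`; calling `model` afterwards is just arithmetic with fixed numbers.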

Those are two fundamentally different things. GPT-2 is trained, and is no longer a search algorithm by any useful definition. There are examples of small neural nets we can understand, and they're not running search algorithms; Quanta did a story about some just last week. If you can do simulated annealing you should probably just look into NN algorithms in detail yourself, because then you can see how that's wrong without the internet's help.

[–] conciselyverbose@sh.itjust.works 1 points 7 months ago* (last edited 7 months ago) (1 children)

I'm not calling it a search algorithm. I'm saying they all do the same math, and doing the math with more parallelism and variables doesn't make what it is a mystery.

Search algorithms searching for functions isn't new. Not knowing what each parameter corresponds to because you made your model huge doesn't make LLMs a mystery. It's still functionally one part. The hormone system is as complex as LLMs. Regulation of neurotransmitters is as complex as LLMs. Even ignoring those external factors, which are critical to how it works, individual portions of the brain are more complex than LLMs, and they're all interconnected on top of that.

I fully believe we'll get to AGI eventually (probably not before we understand the brain a lot better), but the idea that one pretty simple algorithm is going to get us there is crazy. Human intelligence is a system of disparate systems of disparate systems at minimum.

[–] CanadaPlus@lemmy.sdf.org 1 points 7 months ago (1 children)

So does having more parts make something a mystery, as in your second paragraph, or not a mystery, as in your first?

I was a skeptic back in the day too, but they've already far exceeded what an algorithm I could write from memory seems like it should be able to do.

[–] conciselyverbose@sh.itjust.works 1 points 7 months ago (1 children)

A combination of unique, varied parts is a complex algorithm.

A bunch of the same part repeated is a complex model.

Model complexity is not in any way similar to algorithmic complexity. They're only described using the same word because language is abstract.
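One way to picture the distinction, as a toy sketch only: the layer shape, the ReLU step, and the names `make_layers`, `forward`, and `count_parameters` are assumptions for illustration, not a description of how any real LLM is built.

```python
import random

def make_layers(depth, width, seed=0):
    """Build `depth` identically shaped layers of random weights (toy model)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1, 1) for _ in range(width)] for _ in range(width)]
            for _ in range(depth)]

def forward(layers, x):
    """The whole 'algorithm': the same multiply-and-ReLU step, repeated per layer."""
    for weights in layers:
        x = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
    return x

def count_parameters(layers):
    """Total number of individual weights across all layers."""
    return sum(len(row) for weights in layers for row in weights)

small = make_layers(depth=2, width=4)
large = make_layers(depth=20, width=40)
small_params = count_parameters(small)   # 2 * 4 * 4
large_params = count_parameters(large)   # 20 * 40 * 40
output = forward(small, [1.0, 1.0, 1.0, 1.0])
```

The parameter count grows a thousandfold between `small` and `large`, but `forward` is the same few lines in both cases: the model gets more complex while the algorithm does not.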

[–] CanadaPlus@lemmy.sdf.org 1 points 7 months ago* (last edited 7 months ago)

So I guess it comes down to a neurology question. How much algorithmic complexity have we found in the brain?

As far as I'm aware, we've found a few islands of neurons that work together in an obvious way, to track location on a grid for example, and hormone cycles that form a nice negative feedback loop, to keep you at an acceptable blood-sugar level for example. Most of it is still a mystery glob of neurons and other cells, albeit with a fixed pattern of layers and folds.

If we had measured massive algorithmic complexity in the brain, I'd agree with you. As it is, though, it seems unclear how much of the structure we see is conventional algorithm, and how much is the equivalent of an ANN architecture, which ultimately does the same job as no structure would but learns more efficiently, or is even just a biological spandrel.