Technology

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to one another!
Mod-approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts; they are OK as comments.
Only approved bots from the list below may post. To ask whether your bot can be added, please contact us.
Check for duplicates before posting; duplicates may be removed.

Approved Bots

founded 1 year ago

Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism (arstechnica.com)

submitted 4 months ago by Alphane_Moon@lemmy.world to c/technology@lemmy.world

autotldr@lemmings.world 3 points 4 months ago

This is the best summary I could come up with:

"And I try to help people understand there is an exponential here, and the unfortunate thing is you only get to sample it every couple of years because it just takes a while to build supercomputers and then train models on top of them."

The laws suggest that simply scaling up model size and training data can lead to significant improvements in AI capabilities without necessarily requiring fundamental algorithmic breakthroughs.
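
As a rough illustration of what such a law can look like, here is a minimal sketch that assumes loss falls as a power law in parameter count. The function name, the functional form, and both constants are placeholders chosen for illustration; none of them come from the article or from Scott.

```python
# Illustrative power-law scaling curve.
# ASSUMPTION: the form (n_c / n_params) ** alpha and both default constants
# are placeholders for this sketch, not figures from the article.

def power_law_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Hypothetical loss as a power law in parameter count."""
    return (n_c / n_params) ** alpha


if __name__ == "__main__":
    # Each step is a 10x increase in (hypothetical) model size.
    for n in (1e9, 1e10, 1e11, 1e12):
        print(f"{n:.0e} params -> loss {power_law_loss(n):.3f}")
```

Under this assumed form, every 10x increase in model size multiplies the loss by the same factor, `10 ** -alpha` (roughly 0.84 with the placeholder exponent). That is the sense in which scaling alone is expected to keep yielding improvements: steady gains per order of magnitude, with no change to the algorithm.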

The perception of a slowdown has been fueled by largely informal observations, along with some benchmark results, about recent models like Google's Gemini 1.5 Pro, Anthropic's Claude Opus, and even OpenAI's GPT-4o. Some argue that these models haven't shown the dramatic leaps in capability seen in earlier generations and that LLM development may be approaching diminishing returns.

Scott's stance suggests that tech giants like Microsoft still feel justified in investing heavily in larger AI models, betting on continued breakthroughs rather than hitting a capability plateau.

Some perceptions of slowing progress in LLM capabilities and benchmarking may be due to the rapid onset of AI in the public eye when, in fact, LLMs have been developing for years prior.

In the podcast interview, the Microsoft CTO pushed back against the idea that AI progress has stalled, but he acknowledged the challenge of infrequent data points in this field, as new models often take years to develop.

The original article contains 697 words; the summary contains 217 words. Saved 69%. I'm a bot and I'm open source!