I saw this headline a couple of times, and tried unsuccessfully to work out the metaphor. Finally googled it:
A company with no moat has either no advantage or one expected to dissipate relatively quickly.
And I disagree with it too. It's not because of how good the models are in technical terms; the corporate juggernauts are only just ahead of OSS on that front. It's server space, and the money to acquire it, that is the moat.
An average internet user will not install the Vicunas and the Pygmalions and the LLaMAs of the LLM space. Why?
For one, the field is too complicated to get into, but, more importantly, a lot of people can't.
Even the lowest complexity models require a PC + graphics card with a fairly beefy amount of VRAM (6GB at bare minimum), and the ones that can go toe-to-toe with ChatGPT are barely runnable on even the most monstrous of cards. No one is gonna shell out 1500 bucks for the 4090 just so they can run Vicuna-30B.
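To put rough numbers on that, here's a back-of-the-envelope sketch. It only counts the memory needed to hold the weights (parameter count times bits per parameter) and ignores the KV cache, activations and framework overhead, so real usage will be higher. The function names are mine, the parameter counts are the nominal ones from the model names, and 24 GB is the VRAM on a 4090.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: ignores KV cache, activations and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def weights_fit(params_billion: float, bits_per_param: int, vram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weight_memory_gb(params_billion, bits_per_param) <= vram_gb


if __name__ == "__main__":
    VRAM_4090_GB = 24  # RTX 4090
    for params in (7, 13, 30):
        for bits in (16, 8, 4):
            size = weight_memory_gb(params, bits)
            verdict = "fits" if weights_fit(params, bits, VRAM_4090_GB) else "does not fit"
            print(f"{params}B @ {bits}-bit: ~{size:.1f} GB of weights, {verdict} in 24 GB")
```

By this estimate a 30B model needs about 60 GB at 16-bit and about 15 GB at 4-bit, so even a 4090 only gets there with aggressive quantization, while a 7B model at 4-bit comes in around 3.5 GB.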
They are gonna use online, free-to-use, no BS, no technical jargon LLM services. All the major companies know that.
ChatGPT and services like it have set the expectation: "just type it in, get an amazing response in seconds, no matter where".
OSS can't beat that, at least not right now. And until it can, the 99% will be in Silicon Valley's jaws.
Given the pace of OSS optimisation, I fully expect the requirements for a model with GPT-3.5-equivalent performance to be much lower in the coming year. The biggest issues right now are around training and fine-tuning; inference is cheaper, resource-wise. For truly large models, the moat is most definitely GPU compute and power constraints. Those who own their own GPU farms will be at an advantage until there is a significant increase in cloud GPU capacity. Right now, cloud GPU time is at a premium and can also involve waiting for access. I don't expect this to change in the next year or two.
TL;DR: the moat is real, but it's GPU and power constraints.
I hope to god you are right. What will truly be a revolution is if somehow these models can be transitioned to CPU-bound rather than GPU without completely tanking performance. Then we can start talking about running it on phones and laptops.
But I don't know how much more you can squeeze out of the LLM stone. I'm surprised that what we got was essentially a brute-forcing of concepts with massive catalogs of data, rather than something more hand-crafted or built from scratch. Maybe there is another way to go about it? God I hope so, so OSS can use it before the big guys convince governments to drop the hammer.
I can see most individuals and SMBs going with specialist "good enough" models which they can run on-prem or locally, leaving the truly huge systems to those with compute to spare. The security model for these MAAS systems is pretty much "trust me bro". A lot of companies will not want to, or be able to, trust such a system. PI/CID cannot be left in the hands of the AI-as-a-service company. They will have to either go on-prem or stand up their own models in their private cloud. Again, this limits model size, available compute, etc. for orgs, which points to using available models, optimised as far as possible. OSS FTW (I hope)
Have you looked into AIHorde?
It's clearly harder to use than the commercial alternatives, but at first glance it doesn't seem too bad.
It looks about as complicated as setting up any of the other volunteer compute projects (like SETI@home).
I didn't know about it, but it looks really neat. Gonna give it a spin to help me summarize documentation.
Edit: I meant to reply to the article, sorry 😅