I keep hearing this about Arch: could you educate a noob on what kinds of things I'd be dealing with? I'm comfortable with Linux in general but somewhat apprehensive with what I hear about Arch
yacgta
Specifically on what LLM to use, I've been meaning to try Starcoder, but can't vouch for how good it is. In general I've found Vicuna-13B pretty good at generating code.
As for general recommendations, I'd say the main determinant will be if you can afford the hardware requirements to locally host - I presume you're familiar with the fact that you'll (usually) need roughly 2x the number of parameters in VRAM (e.g. 7B parameters means 14GB of VRAM). Techniques like quantization to 8-bits halve the requirement, with the more extreme 4-bit quantization halving them again (at the expense of generation quality).
And if you don't have enough VRAM, there's always llama.cpp - I think that list of supported models is outdated, and it supports way more than those.
On the "what software to use for self-hosting" I've quite liked FastChat, they even have a way to run an OpenAI API compatible server, which will be useful if your tools expect OpenAI.
Hope this is helpful!
That was it, thank you!
I have a few of these but I forget where they came from, curious if anyone here knows
Ivory for Mastodon also has it, and I'm assuming it came from their Twitter client (never used it myself though)
Did you try manually setting the TDP? In Civ VI I set it to 7W (default graphics) and it has helped quite a bit.
I mean, this is Google we're talking about...
I quite like Tailscale SSH for this, but I don't have as many machines, so not sure how it will scale. You can definitely assign roles here to allow/deny SSH between hosts in your fleet though.