LocalLLaMA

Community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

Discovering Locally Run Language Models: Share Your Favorites/Not So Favorites! (beehaw.org)

submitted 1 year ago* (last edited 1 year ago) by dtlnx@beehaw.org to c/localllama@sh.itjust.works

Let's talk about our experiences working with different models, whether well-known or lesser-known.

Which locally run language models have you tried out? Share your insights, challenges, or anything you found interesting during your encounters with those models.

[–] Kerfuffle@sh.itjust.works 0 points 1 year ago (1 children)

guanaco-65B is my favorite. It's pretty hard to go back to 33B models after you've tried a 65B.

It's slow and requires a lot of resources to run though. Also, not like there are a lot of 65B model choices.

[–] planish@sh.itjust.works 1 points 1 year ago* (last edited 1 year ago) (1 children)

What do you even run a 65b model on?

[–] Kerfuffle@sh.itjust.works 1 points 1 year ago (1 children)

With a quantized GGML version you can just run it on CPU if you have 64GB RAM. It is fairly slow though, I get about 800ms/token on a 5900X. Basically you start it generating something and come back in 30 minutes or so. Can't really carry on a conversation.
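
For a rough sense of what those numbers mean, here is a back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 4.5 bits per weight figure is an assumption (roughly a 4-bit quantization plus some per-block overhead), not something measured here. It only counts the weights, so actual RAM use would be higher once context and runtime overhead are included.

```python
def tokens_generated(minutes: float, ms_per_token: float) -> int:
    """Tokens produced in `minutes` at a steady `ms_per_token` rate."""
    return int(minutes * 60_000 / ms_per_token)


def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # 30 minutes at 800 ms/token, as quoted above
    print(tokens_generated(30, 800))             # 2250
    # 65B parameters at an ASSUMED 4.5 bits per weight
    print(round(weights_size_gb(65e9, 4.5), 1))  # 36.6
```

So a 30 minute wait is on the order of a couple of thousand tokens, and under that assumed quantization the weights alone would fit in 64GB RAM with room to spare.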

[–] planish@sh.itjust.works 1 points 1 year ago

Is it smart enough to get the thread of what you are looking for without as much rerolling or handholding, so that it comes out ahead overall?

[–] Yahma@kbin.social 0 points 1 year ago

Guanaco, WizardLM (uncensored) and Camel-13b have been the best models I've tried that are 13b+.

Surprisingly, the LaMini-LM (Flan 3b) and OpenLlama (3b) have performed very well for smaller models.

[–] dtlnx@beehaw.org 0 points 1 year ago (1 children)

I'd have to say I'm very impressed with WizardLM 30B (the newer one). I run it in GPT4All, and even though it is slow, the results are quite impressive.

Looking forward to Orca 13b if it ever releases!

[–] micheal65536@lemmy.micheal65536.duckdns.org 1 points 1 year ago

Which one is the "newer" one? Looking at the quantised releases by TheBloke, I only see one version of 30B WizardLM (in multiple formats/quantisation sizes, plus the unofficial uncensored version).