You are viewing a single comment's thread from:

RE: Run even larger AI models locally with LM Studio

in LeoFinance · 7 months ago

Mac Studios and even Mac Minis are very popular options for LLMs because of how unified memory works: the GPU can address nearly all of the system RAM. On a 192 GB Mac Studio that works out to roughly 188 GB of usable VRAM (macOS caps GPU memory below the full 192 GB by default, though the limit can be raised). Nowhere else can you get ~188 GB of VRAM for less than the cost of even a single A100 40G.
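To make that concrete, here's a back-of-the-envelope sketch (my own numbers, not from the comment above): quantized weights take roughly params × bits-per-weight ÷ 8 bytes, plus headroom for the KV cache and runtime buffers. The parameter counts, bits-per-weight figures, and the 15% overhead factor are all rough assumptions.

```python
# Back-of-the-envelope check: which quantized models fit in a given VRAM budget?
# All figures are rough assumptions, not measurements.

def weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: billions of params * bits / 8."""
    return params_b * bits_per_weight / 8

OVERHEAD = 1.15  # assumed ~15% headroom for KV cache and runtime buffers

budgets = {"Mac Studio 192GB (~188 GB usable)": 188, "A100 40G": 40}
models = {
    "Llama 2 70B @ ~4.5 bpw": weight_gb(70, 4.5),
    "Mixtral 8x7B @ ~5.5 bpw": weight_gb(46.7, 5.5),
    "120B-class @ ~4.5 bpw": weight_gb(120, 4.5),
}

for name, size in models.items():
    need = size * OVERHEAD
    for gpu, budget in budgets.items():
        verdict = "fits" if need < budget else "does NOT fit"
        print(f"{name}: ~{need:.0f} GB needed -> {verdict} on {gpu}")
```

The point: a single 40 GB card runs out before the 70B class, while ~188 GB of unified memory covers 100B+ models in one box.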


I'm getting 23 tokens per second using the 5-bit Mixtral model.
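For anyone who wants to reproduce that kind of number: LM Studio exposes an OpenAI-compatible local server (by default at http://localhost:1234), so you can time streamed completions from a few lines of Python. This is a minimal sketch; the model name is a placeholder, and counting streamed deltas only approximates the true token count.

```python
# Minimal sketch: measure tokens/sec against LM Studio's local server.
# Assumes the server is running on the default http://localhost:1234
# with a model already loaded.
import time
from openai import OpenAI  # pip install openai

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

start, n_chunks = time.time(), 0
stream = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves whatever model is loaded
    messages=[{"role": "user", "content": "Explain unified memory in one paragraph."}],
    max_tokens=256,
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        n_chunks += 1  # each streamed delta is roughly one token

elapsed = time.time() - start
print(f"~{n_chunks / elapsed:.1f} tokens/sec (approximate; includes prompt processing)")
```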

Macs have a big edge for this.
I'd recommend the 4-bit quant: the 5-bit isn't much better in quality and takes a lot more RAM, as the rough numbers below show. Stick with 4-bit, or jump to something like 8-bit if you have the memory headroom for it.
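A quick illustration of the RAM gap between those quant levels, assuming the common Mixtral 8x7B (~46.7B parameters) and approximate effective bits-per-weight for llama.cpp-style quants:

```python
# Approximate weight sizes for common llama.cpp quants of Mixtral 8x7B.
# Bits-per-weight values are rough effective averages, not exact.
PARAMS_B = 46.7
for quant, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5)]:
    print(f"{quant}: ~{PARAMS_B * bpw / 8:.0f} GB of weights")
# Roughly: Q4 ~28 GB, Q5 ~33 GB, Q8 ~50 GB. The 5-bit step costs ~5 GB
# for a modest quality bump, while 8-bit nearly doubles the 4-bit footprint.
```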