
Local ChatGPT-Style Models on Various Hardware

Dave tests running local ChatGPT-style large language models on hardware ranging from a $50 Raspberry Pi to a $50,000 AI workstation, comparing performance and usability.

Summarized by Llama 3.3 70B Instruct Model


Testing on a Raspberry Pi

  • 📊 Dave installs Llama on a Raspberry Pi 4 with 8 GB of RAM, running Raspberry Pi OS (Raspbian).
  • ⚠️ The model generates roughly one word per second, since the Pi has no usable GPU and only modest CPU power.
  • 📝 The test shows that while the Pi can run the model, it is too slow for real-time use (a way to measure this yourself is sketched below).
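
The summary doesn't name the runtime, but a common way to run Llama models locally is Ollama, which exposes an HTTP API. A minimal sketch for measuring tokens per second on any of the machines in the video; the `llama3.2` model name and the default port are assumptions, not details from the source:

```python
import json
import urllib.request

# Assumes an Ollama server on its default port with the "llama3.2"
# model already pulled -- both are assumptions, not from the video.
URL = "http://localhost:11434/api/generate"

payload = json.dumps({
    "model": "llama3.2",
    "prompt": "Explain what a Raspberry Pi is in one paragraph.",
    "stream": False,  # return one JSON object including timing stats
}).encode()

req = urllib.request.Request(URL, data=payload,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Ollama reports eval_count (tokens generated) and eval_duration (in
# nanoseconds), which gives a comparable tokens-per-second figure.
tokens = result["eval_count"]
seconds = result["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.1f}s -> {tokens / seconds:.2f} tok/s")
```

Running the same prompt on each machine yields an apples-to-apples number; "one word per second" on the Pi corresponds to roughly 1-2 tokens per second.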

Testing on a Mini PC

  • 🖥️ Dave tests Llama on a Herk mini PC, which starts at $388 and pairs a Ryzen 9 7940HS CPU with an integrated Radeon 780M GPU.
  • 💻 He installs the runtime directly on Windows and runs the Llama 3.1 model, which performs well on the CPU but doesn't use the GPU because of its limited dedicated memory.
  • 📊 A smaller Llama 3.2 model runs faster but still doesn't use the GPU, likely due to compatibility issues (a back-of-the-envelope memory check follows this list).
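
Whether a model can use the GPU often comes down to memory: the quantized weights plus working buffers must fit in the VRAM the runtime can see. A rough sketch of that arithmetic; the 4-bit quantization level and ~20% overhead factor are illustrative assumptions, not figures from the video:

```python
def model_footprint_gb(params_billion: float, bits_per_weight: int = 4,
                       overhead: float = 1.2) -> float:
    """Rough in-memory size of a quantized model.

    params_billion: parameter count in billions (e.g. 8 for Llama 3.1 8B)
    bits_per_weight: quantization level (4-bit is a common local default)
    overhead: fudge factor for KV cache, activations, and buffers
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 1024**3

for name, size in [("Llama 3.2 3B", 3), ("Llama 3.1 8B", 8),
                   ("Llama 3.1 70B", 70), ("Llama 3.1 405B", 405)]:
    print(f"{name}: ~{model_footprint_gb(size):.0f} GB at 4-bit")
```

An integrated GPU like the 780M typically has only a small slice of system RAM reserved as dedicated VRAM, so even the ~4-5 GB an 8B model needs can push the runtime back onto the CPU.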

Testing on Higher-End Hardware

  • 🤖 Dave moves on to a Threadripper 3970X with an Nvidia RTX 4080 GPU, which runs the Llama 3.1 model quickly and actually uses the GPU.
  • 📊 He also tests the model on an M2 Mac Pro, which performs well; its unified memory lets the system allocate ordinary RAM as video RAM.
  • 🚀 Finally, he tests a 96-core Threadripper with an Nvidia RTX 6000 Ada card, which still struggles with the massive 405-billion-parameter model (the arithmetic below shows why).
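
The 405B result is unsurprising once you do the memory math: even aggressively quantized, the weights dwarf any single card. A quick check; the 48 GB figure is the RTX 6000 Ada's published VRAM, and the quantization levels are illustrative:

```python
VRAM_GB = 48  # published memory of an Nvidia RTX 6000 Ada

for bits in (16, 8, 4):
    weights_gb = 405e9 * bits / 8 / 1024**3  # weights alone, no overhead
    ratio = weights_gb / VRAM_GB
    print(f"405B @ {bits}-bit: ~{weights_gb:,.0f} GB "
          f"(~{ratio:.0f}x the card's {VRAM_GB} GB)")
```

Anything that doesn't fit in VRAM spills into system RAM and onto the CPU, which is why even a 96-core workstation crawls on this model.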

Conclusion

  • 📊 Model size and complexity dominate performance, regardless of the hardware used.
  • 📈 Dave concludes that choosing the right model for your hardware is crucial, and that even high-end workstations can be brought to their knees by the largest models.