prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
Local ML voice chat using high-end models.
Local LLMs in your DAW (digital audio workstation)!
A C++ implementation of Open Interpreter.
LLM inference in C/C++. On the Releases page, you can download pre-built binaries for arm, armv7l, and Raspberry Pi.
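Once a binary is downloaded, text generation typically takes a single command. The following is a minimal sketch, assuming the llama-cli tool that ships with recent llama.cpp releases and a placeholder GGUF model path:

    ./llama-cli -m ./models/model-q4_k_m.gguf -p "Explain what a GGUF file is." -n 128

Here -m points at the model file, -p supplies the prompt, and -n caps the number of tokens to generate.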