llama-local for the Cheshire Cat AI (NVIDIA only)

This is an adaptation of llama-cpp-python that can be launched easily with docker compose on a machine with an NVIDIA GPU.

Clone the repo:

git clone https://github.com/cheshire-cat-ai/llama-local.git

Create your .env based on the provided example:

cp .env.example .env

Download the model of your choice (GGML format; many LLaMA variants are available).

Place your .bin model file in the models folder.

Set MODEL_NAME in .env to the exact filename of your model, including the .bin extension.
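
A mismatch between MODEL_NAME and the file in the models folder is easy to miss, so a quick check before launching may help. The following is a minimal sketch, assuming .env contains a plain `MODEL_NAME=<filename>` line and that it is run from the repo root. The function name `check_model` is hypothetical and not part of this repo.

```shell
# Hypothetical helper, not part of the repo: checks that MODEL_NAME in .env
# names a file that actually exists in the models folder.
check_model() {
    env_file="${1:-.env}"
    models_dir="${2:-models}"

    if [ ! -f "$env_file" ]; then
        echo "missing $env_file" >&2
        return 1
    fi

    # Take the last MODEL_NAME=... line and strip any surrounding quotes.
    model_name=$(sed -n 's/^MODEL_NAME=//p' "$env_file" | tail -n 1 | tr -d "\"'")

    if [ -z "$model_name" ]; then
        echo "MODEL_NAME is not set in $env_file" >&2
        return 1
    fi

    if [ ! -f "$models_dir/$model_name" ]; then
        echo "model file not found: $models_dir/$model_name" >&2
        return 1
    fi

    echo "ok: $models_dir/$model_name"
}
```

After defining the function, run `check_model` with no arguments to check `.env` against `models`, or pass both paths explicitly.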

Launch the container:

docker compose up

Once the container is running, go to http://localhost:8000/docs to try out the endpoints.
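
The server may need some time after `docker compose up` before it answers requests. The sketch below polls the docs URL until it responds. It assumes `curl` is installed on the host; the function name `wait_for_server` and the default of 30 attempts are illustrative choices, not part of this repo.

```shell
# Hypothetical helper: polls a URL until it answers or the attempts run out.
wait_for_server() {
    url="${1:-http://localhost:8000/docs}"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard the body
        if curl -sf -o /dev/null --max-time 2 "$url"; then
            echo "server is up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "server did not respond: $url" >&2
    return 1
}
```

Calling `wait_for_server` with no arguments checks http://localhost:8000/docs roughly once per second; a larger second argument gives the server more time to start.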

TODO: instructions on how to configure the cat
