I develop a compact RAG (Retrieval-Augmented Generation) system that runs on a Raspberry Pi. As the database for the RAG, I adopt SQLite and implement the vector DB with the sqlite-vec extension.
- Develop a compact RAG that runs on a Raspberry Pi, supporting hybrid RAG with both an SQL DB and a vector DB.
- The RAG also works as an API server for my other projects: virtual-showroom and node-red-ai-agents.
- OpenAI API key
- LLM model: gpt-4o-mini
- Embeddings model: text-embedding-3-small
- Raspberry Pi
                                  Brain
                           [OpenAI API service]
 Unity app                          |
[VirtualShowroom]-----+             |
                      |             |
 Web apps             |    Compact RAG (app.py)
[Web Browser]---------+-------[Raspberry Pi]---+---USB---[Camera with mic]
                      |             |          |
 AI Agents            |         SQLite DB      +---USB---[Speaker]
[Node-RED]------------+
$ git clone https://github.com/asg017/sqlite-vec
$ cd sqlite-vec
$ sudo apt-get install libsqlite3-dev
$ make loadable
Find "vec0.so" in ./dist directory.
- Step 1. Generating chunks: I run this notebook on my Mac.
- Step 2. Calculating embeddings: I run this script on my Raspberry Pi 3 (a rough sketch of this step follows below).
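For reference, the overall flow of Step 2 looks roughly like the sketch below. This is not the actual notebook or script: the file names, table names, and vec0 path are assumptions. Chunk texts go into a plain SQLite table, their embeddings into a vec0 virtual table, and a KNN query joins the two at retrieval time, which is the hybrid SQL-plus-vector setup mentioned above.

import json
import sqlite3
from openai import OpenAI

client = OpenAI()                        # reads OPENAI_API_KEY from the environment
EMBED_MODEL = "text-embedding-3-small"   # 1536-dimensional embeddings

db = sqlite3.connect("rag.db")           # assumed DB file name
db.enable_load_extension(True)
db.load_extension("./dist/vec0")         # assumed path to vec0.so
db.enable_load_extension(False)

# SQL side: plain table for the chunk texts. Vector side: vec0 virtual table.
db.execute("CREATE TABLE IF NOT EXISTS chunks(id INTEGER PRIMARY KEY, body TEXT)")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS vec_chunks USING vec0(embedding float[1536])")

def embed(text):
    return client.embeddings.create(model=EMBED_MODEL, input=text).data[0].embedding

# Store the chunks produced in Step 1 (chunks.json is a hypothetical file name).
with open("chunks.json") as f:
    for i, body in enumerate(json.load(f)):
        db.execute("INSERT INTO chunks(id, body) VALUES (?, ?)", (i, body))
        db.execute("INSERT INTO vec_chunks(rowid, embedding) VALUES (?, ?)",
                   (i, json.dumps(embed(body))))
db.commit()

# Retrieval: the three nearest chunks to a question, joined back to their texts.
question = json.dumps(embed("What does the virtual showroom display?"))
rows = db.execute(
    """SELECT chunks.body, knn.distance
       FROM (SELECT rowid, distance FROM vec_chunks
             WHERE embedding MATCH ? AND k = 3
             ORDER BY distance) AS knn
       JOIN chunks ON chunks.id = knn.rowid""",
    (question,)).fetchall()
for body, distance in rows:
    print(round(distance, 3), body[:80])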
- cx package ... Python package
- API server ... API server
$ cd app
$ python app.py
The API server provides simple web apps. Access "http://<IP address of the API server>:5050" with a web browser.
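For a quick check without a browser, the same address can also be hit from another machine (the root path is an assumption; adjust it to the actual routes of app.py):

$ curl http://<IP address of the API server>:5050/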
virtual-showroom uses this API server to access the OpenAI API service.
Refer to this article to start the API server automatically at boot.
A sample service file looks like this:
[Unit]
Description=Python Generative AI API server
After=network.target
[Service]
ExecStart=/usr/bin/python3 -m app --directory <Path to "app" folder>
WorkingDirectory=<Path to "app" folder>
Restart=always
RestartSec=10
User=<Your user name>
Group=users
Environment=PYTHONPATH=<Path to this repo on Raspberry Pi>
Environment=OPENAI_API_KEY=<OpenAI API key>
[Install]
WantedBy=multi-user.target
After creating the service file (e.g., /etc/systemd/system/gen_ai.service), reload systemd, enable the service so that it starts at boot, and start it:
$ sudo systemctl daemon-reload
$ sudo systemctl enable gen_ai.service
$ sudo systemctl start gen_ai.service
Confirm that the daemon process is running:
$ sudo systemctl status gen_ai.service
If something goes wrong, check the syslog:
$ tail /var/log/syslog
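The service's own output can also be followed with journalctl, using the unit name from above:

$ journalctl -u gen_ai.service -e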