This server accepts various media as input and performs AI tasks such as image captioning. Its extensible plugin system makes it easy to add new tasks and models. The server maintains a uniform input/output specification for each AI task, regardless of the model used, so models can be swapped easily.
This is the recommended way to install the server.
-
Clone the Repository
-
Install Python 3.11
-
Install Flit
python3 -m pip install flit
-
Install the Core Project and some core plugins
cd src/core
flit install
cd ../simple_plugin_manager
flit install
cd ../fastapi
flit install
cd ../base_api
flit install
-
Install any desired Plugins
cd src/example_plugin
flit install
Remember that for a model to be usable, at least one task plugin must also be installed. (For example, text-embedding may be installed to use open-clip-vit-b32.)
-
Run the Server
run-fes --port 8888 --log-level=DEBUG
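The repeated `cd` and `flit install` steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `plan_installs` and `install_plugins` are hypothetical, and it assumes that `python -m flit install` behaves like the `flit install` command used above and that every plugin lives directly under `src/`.

```python
import subprocess
import sys
from pathlib import Path

# Install order taken from the steps above; append any task or model plugins you need.
CORE_PLUGINS = ["core", "simple_plugin_manager", "fastapi", "base_api"]


def plan_installs(repo_root, plugins=CORE_PLUGINS):
    """Return (working_directory, command) pairs, one per plugin (hypothetical helper)."""
    src = Path(repo_root) / "src"
    # Assumption: invoking flit as a module is equivalent to the `flit` CLI.
    command = [sys.executable, "-m", "flit", "install"]
    return [(src / name, list(command)) for name in plugins]


def install_plugins(repo_root, plugins=CORE_PLUGINS, dry_run=True):
    """Print each install step and, unless dry_run is set, execute it."""
    for cwd, command in plan_installs(repo_root, plugins):
        print(f"{cwd}: {' '.join(command)}")
        if not dry_run:
            # check=True stops at the first plugin that fails to install.
            subprocess.run(command, cwd=cwd, check=True)
```

Calling `install_plugins(".", dry_run=False)` from the repository root would run the installs in order; with the default `dry_run=True` it only prints what it would do.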
This is the recommended way to install the server if you are developing the Feature Extraction Server (FES).
-
Clone the Repository
-
Install Python 3.11
-
Install the Required Packages:
pip install -r dev_requirements.txt
Note: This includes the dependencies for all plugins, so it can take a while. Alternatively, you can install only the dependencies you need. (Check the pyproject.toml files of the plugins you need.)
-
Add the Desired Plugins to the Path:
For example:
export PYTHONPATH=src/core:src/legacy_api:src/audio_diarization:src/blip:src/conditional_image_captioning:src/face_embedding:src/image_captioning:src/optical_character_recognition:src/simple_plugin_manager:src/vit_gpt2:src/automated_speech_recognition:src/blip2:src/detr_resnet101:src/face_recognition:src/image_embedding:src/owl_vit_base_patch32:src/tesseract:src/whisper:src/base_api:src/clip_vit_large_patch14:src/easy_ocr:src/fastapi:src/object_detection:src/pyannote:src/text_embedding:src/zero_shot_image_classification:$PYTHONPATH
export LOG_LEVEL=DEBUG
-
Run the Server:
To run the server, use the entrypoint:
python run_dev_server.py --port 8888
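The long `PYTHONPATH` value in the development setup is simply each plugin directory under `src/` joined by the platform's path separator. As a rough sketch (the helper name `build_pythonpath` is hypothetical and not part of the repository), it could be assembled from a short plugin list like this:

```python
import os
from pathlib import Path


def build_pythonpath(plugins, src_dir="src", existing=None):
    """Join plugin directories into a PYTHONPATH value (hypothetical helper).

    Any existing PYTHONPATH is appended last, mirroring the trailing
    $PYTHONPATH in the export command.
    """
    entries = [str(Path(src_dir) / name) for name in plugins]
    if existing:
        entries.append(existing)
    return os.pathsep.join(entries)


if __name__ == "__main__":
    # Example selection only; list whichever plugins you actually need.
    wanted = ["core", "simple_plugin_manager", "base_api", "fastapi", "text_embedding"]
    value = build_pythonpath(wanted, existing=os.environ.get("PYTHONPATH"))
    print(f"export PYTHONPATH={value}")
```

Printing the `export` line rather than setting the variable keeps the sketch side-effect free; the output can be pasted into a shell before running `python run_dev_server.py`.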
Note: Currently, the vitrivr username on Docker Hub does not have the correct images. Use the faberf username instead.
Follow these steps to run the server using Docker:
-
Install Docker:
You need to have Docker installed on your machine. You can download Docker Desktop for Mac or Windows here. For Linux users, Docker Engine is the appropriate version, and the installation instructions vary by distribution.
-
Build or Pull the Docker Image:
The Dockerfile requires the build arguments PLUGINPATH and CMD_ENTRYPOINT to build an image.
docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v7 \
  --build-arg PLUGINPATH="core:simple_plugin_manager:base_api:fastapi" \
  --build-arg CMD_ENTRYPOINT="run-fes --port 8888 --host 0.0.0.0" \
  --tag "featureextractionserver:my_custom_tag" \
  --push \
  .
You can check build_docker.sh for more examples. Alternatively, you can use a prebuilt image from Docker Hub. Choose a tag from https://hub.docker.com/r/vitrivr/featureextractionserver/tags. For example, to pull an image with all plugins installed, use the following command:
docker pull vitrivr/featureextractionserver:full
-
Run the Docker Image:
After pulling the image, you can run it using the following command:
sudo docker run -it -p 5000:8888 -v ~/.cache:/root/.cache -v ./logs:/app/logs -e LOG_LEVEL=DEBUG -t vitrivr/featureextractionserver:full
This command starts a Docker container from the image and maps port 5000 of your machine to port 8888 of the Docker container. It also binds the container's /root/.cache directory to your local .cache directory so that downloaded machine learning models persist between runs, saving time.
-
Access the Server:
You should now be able to access the server at http://localhost:5000. If you are using Docker Toolbox (generally for older systems), the Docker IP will likely be something other than localhost, typically 192.168.99.100. In this case, the server will be accessible at http://192.168.99.100:5000.
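In the `docker run` command above, `-p 5000:8888` publishes container port 8888 (where the server listens) on host port 5000. The sketch below illustrates that mapping and a basic reachability check. Both helper names are hypothetical, the parser only handles the simple `HOST:CONTAINER` form, and the check only confirms that something accepts TCP connections, not that the server is healthy.

```python
import socket


def parse_port_mapping(mapping):
    """Split a simple Docker `-p HOST:CONTAINER` value into two integers.

    Forms with a bind address or protocol suffix are not handled here.
    """
    host_port, container_port = mapping.split(":")
    return int(host_port), int(container_port)


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host_port, container_port = parse_port_mapping("5000:8888")
    print(f"host port {host_port} -> container port {container_port}")
    print("reachable:", is_port_open("localhost", host_port))
```

With Docker Toolbox, the same check would be run against the Docker IP (typically 192.168.99.100) instead of localhost.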
Note: To stop the Docker container, press CTRL+C in the terminal window. If that does not work, open a new terminal window, run docker ps to get the CONTAINER_ID, and then run docker stop CONTAINER_ID to stop the container.
Note: If the server crashes, then it likely ran out of memory. If you're running on Docker Desktop, you can increase the memory allocated to Docker in Docker's preferences:
- For Mac: Docker menu > Preferences > Resources > Memory
- For Windows: Docker menu > Settings > Resources > Memory
32 GB is a good amount. This will only work if your host machine has enough free memory.
Settings can be set using a command line argument (CLA), an environment variable (EV), or an .env file (EF).
The following settings come from the core plugin.
Name | Command Line Argument | Description |
---|---|---|
LOG_LEVEL | --log-level | The log level. Must be one of ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] |
LOG_PATH | --log-path | The path to the log file. |
DEFAULT_CONSUMER_TYPE | --default-consumer-type | The default consumer type. Must be one of ['single_thread_consumer'] |
Additional settings may be defined or required in other plugins.
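To illustrate how one setting can come from three sources, here is a standard-library sketch that resolves LOG_LEVEL. It is not the server's actual implementation: the function names are hypothetical, and the precedence shown (command line argument, then environment variable, then .env file) is an assumption that this README does not confirm.

```python
import argparse
import os

LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not os.path.isfile(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_log_level(argv=None, environ=None, env_file=None):
    """Resolve LOG_LEVEL from the three sources.

    ASSUMED precedence (not confirmed by the README):
    command line argument, then environment variable, then .env file.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", choices=LOG_LEVELS, default=None)
    args, _unknown = parser.parse_known_args(argv)
    environ = os.environ if environ is None else environ
    env_file = read_env_file() if env_file is None else env_file
    value = args.log_level or environ.get("LOG_LEVEL") or env_file.get("LOG_LEVEL")
    if value is not None and value not in LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {LOG_LEVELS}, got {value!r}")
    return value
```

For example, `resolve_log_level(["--log-level=DEBUG"], {"LOG_LEVEL": "INFO"}, {})` yields the command line value under the assumed precedence. Check the core plugin's source for the real resolution order.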
The server is extensible through plugins.
For creating new plugins which use the extraction backend to provide new interfaces (such as CLIs), see here.
For creating new plugins which add new endpoints to the REST API, see the FastAPI plugin README.
For creating new tasks, see here.
For creating new models, see here.
- See this page for more information on configuring the FastAPI REST server
- This plugin extends the server with endpoints that allow the user to create new jobs and get results.
- This plugin defines simpler endpoints that allow the user to both create new jobs and get results in a single call.
Installing these plugins adds tasks to the server which can be accessed through a variety of interfaces (see above). To properly use them, compatible models must also be installed.
- Audio Diarization
- Automated Speech Recognition
- Conditional Image Captioning
- Face Embedding
- Image Captioning
- Image Embedding
- Object Detection
- Optical Character Recognition
- Text Embedding
- Text Query Embedding
- Zero Shot Image Classification
## TODO Accounts