Commit c1c1ee2 (1 parent: 516170c): 1 changed file with 148 additions and 0 deletions.
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fastembed Multi-GPU Tutorial\n",
    "This tutorial demonstrates how to use Fastembed's multi-GPU support. Fastembed can embed both text and images, using modern GPUs for acceleration. Let's walk through using Fastembed with multiple GPUs step by step."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Prerequisites\n",
    "To get started, ensure you have the following:\n",
    "- Python 3.8 or later\n",
    "- Fastembed (`pip install fastembed-gpu`)\n",
    "- Refer to [this](https://github.com/qdrant/fastembed/blob/main/docs/examples/FastEmbed_GPU.ipynb) tutorial if you run into GPU dependency issues\n",
    "- Access to a multi-GPU server"
   ]
  },
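Before installing `fastembed-gpu`, it can help to confirm the interpreter meets the tutorial's minimum version. The helper below is a hypothetical convenience, not part of Fastembed:

```python
import sys

# Hypothetical helper (not part of Fastembed): check that the running
# interpreter meets the tutorial's minimum Python version (3.8).
def meets_min_python(required=(3, 8)):
    # sys.version_info[:2] is a (major, minor) tuple; tuples compare
    # element-wise, so this is a correct version check.
    return sys.version_info[:2] >= required

print(meets_min_python())  # True on Python 3.8+
```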
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi-GPU with the `cuda` argument and TextEmbedding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastembed import TextEmbedding\n",
    "\n",
    "# define the documents to embed\n",
    "docs = [\"hello world\", \"flag embedding\"]\n",
    "\n",
    "# define the GPU ids to use\n",
    "device_ids = [0, 1]\n",
    "\n",
    "# initialize a TextEmbedding model using CUDA\n",
    "embedding_model = TextEmbedding(\n",
    "    model_name=\"sentence-transformers/all-MiniLM-L6-v2\",\n",
    "    cuda=True,\n",
    "    device_ids=device_ids,\n",
    "    cache_dir=\"models\",\n",
    ")\n",
    "\n",
    "# embed() returns a lazy generator; materialize it with list()\n",
    "embeddings = list(embedding_model.embed(docs, parallel=len(device_ids)))\n",
    "print(embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this snippet:\n",
    "- `cuda=True` enables GPU acceleration.\n",
    "- `device_ids=[0, 1]` specifies which GPUs to use. Replace `[0, 1]` with the GPU ids available on your machine.\n",
    "- `parallel=len(device_ids)` runs one worker per GPU so the documents are embedded in parallel."
   ]
  },
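To build intuition for what a data-parallel split across `device_ids` looks like, here is a toy round-robin sketch. This is an illustration only, not Fastembed's actual scheduler:

```python
# Toy sketch (NOT Fastembed's internal scheduler): round-robin
# assignment of documents to GPU ids, illustrating the kind of
# data-parallel split that parallel=len(device_ids) implies.
def round_robin_split(docs, device_ids):
    shards = {device: [] for device in device_ids}
    for i, doc in enumerate(docs):
        # document i goes to device i mod number-of-devices
        shards[device_ids[i % len(device_ids)]].append(doc)
    return shards

print(round_robin_split(["a", "b", "c", "d", "e"], [0, 1]))
# {0: ['a', 'c', 'e'], 1: ['b', 'd']}
```

Each shard would then be embedded independently by the worker pinned to that GPU.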
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi-GPU with the `cuda` argument and ImageEmbedding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from io import BytesIO\n",
    "\n",
    "import requests\n",
    "from PIL import Image\n",
    "from fastembed import ImageEmbedding\n",
    "\n",
    "# load a sample image\n",
    "images = [Image.open(BytesIO(requests.get(\"https://qdrant.tech/img/logo.png\", timeout=10).content))]\n",
    "\n",
    "# define the GPU ids to use\n",
    "device_ids = [0, 1]\n",
    "\n",
    "# initialize an ImageEmbedding model using CUDA\n",
    "image_model = ImageEmbedding(\n",
    "    model_name=\"Qdrant/clip-ViT-B-32-vision\",\n",
    "    cuda=True,\n",
    "    device_ids=device_ids,\n",
    "    cache_dir=\"models\",\n",
    ")\n",
    "\n",
    "# embed() returns a lazy generator; materialize it with list()\n",
    "image_embeddings = list(image_model.embed(images, parallel=len(device_ids)))\n",
    "print(image_embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi-GPU using Provider Options\n",
    "For advanced users, Fastembed also accepts custom ONNX Runtime providers, giving finer-grained control over how the model is placed on devices."
   ]
  },
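The providers list is just a list of `(provider_name, options)` tuples. A small hypothetical helper (not part of Fastembed) shows how to build one entry per GPU id:

```python
# Hypothetical convenience helper (not in Fastembed): build an ONNX
# Runtime providers list with one CUDAExecutionProvider entry per GPU id.
def cuda_providers(device_ids):
    return [
        ("CUDAExecutionProvider", {"device_id": device_id})
        for device_id in device_ids
    ]

print(cuda_providers([0, 1]))
# [('CUDAExecutionProvider', {'device_id': 0}), ('CUDAExecutionProvider', {'device_id': 1})]
```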
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# customize provider options for CUDA: one entry per GPU\n",
    "first_device_id = 0\n",
    "second_device_id = 1\n",
    "providers = [\n",
    "    (\"CUDAExecutionProvider\", {\"device_id\": first_device_id}),\n",
    "    (\"CUDAExecutionProvider\", {\"device_id\": second_device_id}),\n",
    "]\n",
    "\n",
    "# initialize the model with custom providers (reuses `docs` from above)\n",
    "custom_model = TextEmbedding(\n",
    "    model_name=\"sentence-transformers/all-MiniLM-L6-v2\",\n",
    "    providers=providers,\n",
    "    cache_dir=\"models\",\n",
    ")\n",
    "\n",
    "# embed() returns a lazy generator; materialize it with list()\n",
    "custom_embeddings = list(custom_model.embed(docs, parallel=2))\n",
    "print(custom_embeddings)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}