Download the Langchain-Chatchat with IPEX-LLM integrations from this link. Unzip the content into a directory, e.g. `/home/arda/Langchain-Chatchat-ipex-llm`.
Visit the Install IPEX-LLM on Linux with Intel GPU Guide, and follow the Install Prerequisites section to install the GPU driver, oneAPI, and Conda.
**Note**: Please make sure you have checked the `level_zero` version as we suggested here.
Run the following commands to create a new Python environment and install IPEX-LLM:

```bash
conda create -n ipex-llm-langchain-chatchat python=3.11
conda activate ipex-llm-langchain-chatchat
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install --pre --upgrade torchaudio==2.1.0a0 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
**Note**: You can also use `--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/`.
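Before installing the application dependencies, you can optionally verify the GPU setup. The following sanity check is our suggestion rather than part of the official guide; it assumes the oneAPI environment can be sourced as in the service start step further below:

```bash
# Optional sanity check (an assumption, not from the official guide): confirm
# that PyTorch can see the Intel GPU through IPEX after installation.
source /opt/intel/oneapi/setvars.sh
python -c "import torch, intel_extension_for_pytorch; print(torch.xpu.is_available())"
# Prints True when the GPU driver and oneAPI are set up correctly.
```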
Switch to the root directory of the Langchain-Chatchat you've downloaded (refer to the download section), and install the dependencies with the commands below. Note: the example commands assume the root directory is `/home/arda/Langchain-Chatchat-ipex-llm`; remember to change it to your own path.
```bash
cd /home/arda/Langchain-Chatchat-ipex-llm
pip install -r requirements_ipex_llm.txt
pip install -r requirements_api_ipex_llm.txt
pip install -r requirements_webui.txt
```
- In the root directory of Langchain-Chatchat, run the following command to create a config:

  ```bash
  python copy_config_example.py
  ```
- Edit the file `configs/model_config.py`, and change `MODEL_ROOT_PATH` to the absolute path of the parent directory where all the downloaded models (LLMs, embedding models, ranking models, etc.) are stored, as in the sketch below.
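  As a minimal sketch of this edit, assuming all models are stored under `/home/arda/models` (an illustrative path; substitute your own):

  ```python
  # configs/model_config.py -- illustrative value only; /home/arda/models is an
  # example path, not a default from the project
  MODEL_ROOT_PATH = "/home/arda/models"
  ```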
Download the models and place them in the directory `MODEL_ROOT_PATH` (refer to details in the Configuration section).
Currently, we support only the LLM/embedding models specified in the table below. You can download these models using the links provided in the table. Note: ensure the model folder name matches the last segment of the model ID following the "/"; for example, for `THUDM/chatglm3-6b`, the model folder name should be `chatglm3-6b`.
| Model | Category | Download Link |
|---|---|---|
| `THUDM/chatglm3-6b` | Chinese LLM | HF or ModelScope |
| `meta-llama/Llama-2-7b-chat-hf` | English LLM | HF |
| `BAAI/bge-large-zh-v1.5` | Chinese Embedding | HF |
| `BAAI/bge-large-en-v1.5` | English Embedding | HF |
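For instance, one way to download a model so that the folder name follows this convention (a sketch assuming `MODEL_ROOT_PATH` is `/home/arda/models` and that you use the Hugging Face CLI; any download method producing the same layout works):

```bash
# Illustrative download via the Hugging Face CLI; /home/arda/models is an example
# path. The target folder name must match the last segment of the model ID.
pip install huggingface_hub
huggingface-cli download THUDM/chatglm3-6b --local-dir /home/arda/models/chatglm3-6b
```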
Run the following commands to start the service:

```bash
conda activate ipex-llm-langchain-chatchat
source /opt/intel/oneapi/setvars.sh        # load the oneAPI environment
export SYCL_CACHE_PERSISTENT=1             # persist the SYCL kernel cache across runs
export BIGDL_LLM_XMX_DISABLED=1
export BIGDL_IMPORT_IPEX=0
export no_proxy=localhost,127.0.0.1        # keep local traffic off any proxy
export FASTCHAT_WORKER_API_TIMEOUT=600     # allow up to 600s per worker call (covers warm-up)
python startup.py -a                       # start all servers, including the Web UI
```
**Note**: The LLM model is warmed up when you start the first conversation, and the embedding model is warmed up either when you create a knowledge base or when you start the first Knowledge Base QA/File Chat conversation. Thus, please expect a several-minute warm-up during your first conversation with an LLM model, or when you create a new knowledge base with an embedding model.
You can find the Web UI's URL printed in the terminal logs, e.g. http://localhost:8501/.
Open a browser and navigate to the URL to use the Web UI.