Cannot send request #216

wanzhixiao · 2024-03-24T11:15:10Z

System Info

CUDA Version: 12.0, with an A10 GPU
CentOS: 7.9

Information

Docker
The CLI directly

Tasks

An officially supported command
My own modifications

Reproduction

I start text-embeddings-inference with the following script:

volume=/data/pretrain_model
model=/data/pretrain_model/bge-small-zh-v1.5
revision=refs/pr/5

docker run -d --restart=always --gpus all \
    -p 8070:80 \
    -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:86-1.2 \
    --model-id $model \
    --tokenization-workers 2

The container starts successfully. However, when I run the following command to calculate an embedding:

curl 127.0.0.1:8070/embed_sparse \
    -X POST \
    -d '{"inputs":"I like you."}' \
    -H 'Content-Type: application/json'

the following error occurs:
curl: (56) Recv failure: Connection reset by peer

How can I solve it?

docker logs

Expected behavior

The request returns the embedding.

CharleyXu · 2024-03-28T02:16:15Z

You can refer to #207

OlivierDehaene mentioned this issue Mar 27, 2024

Is it possible to use the local model directly without downloading it? #207

Closed
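
One thing worth checking, offered as a guess rather than a confirmed diagnosis: the script above mounts $volume at /data inside the container but passes the host path to --model-id. Inside the container the model directory would then appear under /data rather than under /data/pretrain_model. The sketch below shows the path translation. The helper name to_container_path is hypothetical and is not part of Docker or text-embeddings-inference.

```shell
# Hypothetical helper: translate a host path that lives under a bind-mounted
# directory into the path the container sees for it.
to_container_path() {
    host_volume=${1%/}   # host side of -v, e.g. /data/pretrain_model
    mount_point=${2%/}   # container side of -v, e.g. /data
    host_path=$3         # host path to translate

    case "$host_path" in
        "$host_volume"/*)
            # Replace the host prefix with the container mount point.
            printf '%s\n' "$mount_point${host_path#"$host_volume"}"
            ;;
        "$host_volume")
            printf '%s\n' "$mount_point"
            ;;
        *)
            echo "path is not under $host_volume" >&2
            return 1
            ;;
    esac
}

volume=/data/pretrain_model
model=$(to_container_path "$volume" /data "$volume/bge-small-zh-v1.5")
echo "$model"
# The translated value is what would be passed as: --model-id "$model"
```

If the translated path is the one the container actually needs, running docker logs on the container should show whether the model loaded before any request is sent.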
