
Vespa Vector Streaming Search

This sample application demonstrates vector streaming search with Vespa, a feature introduced in Vespa 8.181.15. Read the blog post announcing vector streaming search, and see Streaming Search for more details.

The application uses a small synthetic sample of mail documents for two fictitious users. The subject and content of each mail are combined and embedded into a 384-dimensional embedding space using a BERT embedder.

Quick start

The following is a quick recipe for getting started with this application. Requirements:

  • Docker Desktop installed and running. 4 GB available memory for Docker is recommended. Refer to Docker memory for details and troubleshooting.
  • Alternatively, deploy using Vespa Cloud.
  • Operating system: Linux, macOS or Windows 10 Pro (Docker requirement).
  • Architecture: x86_64 or arm64.
  • Homebrew to install Vespa CLI, or download a Vespa CLI release from GitHub releases.

Verify that Docker has at least 4 GB of memory available:

$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
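The check above can also be scripted. The following is a minimal sketch, assuming that `docker info --format '{{.MemTotal}}'` reports the total memory in bytes and interpreting "4 GB" as 4 × 10^9 bytes. The function name `has_enough_memory` and the variable `MIN_BYTES` are illustrative and not part of Vespa or Docker.

```shell
# Recommended minimum from this guide, read as 4 * 10^9 bytes (an assumption).
MIN_BYTES=4000000000

# has_enough_memory BYTES: succeeds when BYTES is at least MIN_BYTES.
has_enough_memory() {
  [ "$1" -ge "$MIN_BYTES" ] 2>/dev/null
}

# Only ask Docker when the CLI is installed; skip silently otherwise.
if command -v docker >/dev/null 2>&1; then
  mem=$(docker info --format '{{.MemTotal}}' 2>/dev/null)
  if [ -n "$mem" ] && has_enough_memory "$mem"; then
    echo "Docker memory OK: $mem bytes"
  else
    echo "Docker memory is below 4 GB or could not be read" >&2
  fi
fi
```

The comparison is kept in its own function so that it can be reused with a value obtained from another tool, such as Podman.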

Install Vespa CLI:

$ brew install vespa-cli

For local deployment using the Docker image, set the target to local:

$ vespa config set target local

Pull the Vespa Docker image and start a container:

$ docker pull vespaengine/vespa
$ docker run --detach --name vespa --hostname vespa-container \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa

Verify that the configuration service (deploy API) is ready:

$ vespa status deploy --wait 300

Download this sample application:

$ vespa clone vector-streaming-search my-app && cd my-app

Deploy the application:

$ vespa deploy --wait 300

Deployment note

It is possible to deploy this app to Vespa Cloud.

Feeding sample mail documents

During feeding, the subject and content of each mail document are embedded using the BERT embedding model. This is computationally expensive on CPU. For production use cases, use Vespa Cloud with GPU instances and autoscaling enabled.

$ vespa feed ext/docs.json

Query and ranking examples

The following examples use Vespa CLI to execute queries. Use -v to see the curl equivalent using the HTTP API.

Exact nearest neighbor search

$ vespa query 'yql=select * from sources * where {targetHits:10}nearestNeighbor(embedding,qemb)' \
  'input.query(qemb)=embed(events to attend this summer)' \
  'streaming.groupname=1234'

This searches all documents for user 1234, and returns the ten best documents according to the angular distance between the document embedding and the query embedding.
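For readers who prefer plain HTTP, the same query can be sketched with curl. This assumes the local container started earlier in this guide, with the query endpoint at http://localhost:8080/search/. The function name `vespa_query_curl` and the `VESPA_ENDPOINT` variable are hypothetical helpers for illustration, not part of Vespa CLI.

```shell
# vespa_query_curl GROUP TEXT
# Sends the exact nearest neighbor query for one user (GROUP) over HTTP.
# VESPA_ENDPOINT is a hypothetical override; the default assumes the
# local container from this guide, published on port 8080.
vespa_query_curl() {
  curl -s --get "${VESPA_ENDPOINT:-http://localhost:8080}/search/" \
    --data-urlencode 'yql=select * from sources * where {targetHits:10}nearestNeighbor(embedding,qemb)' \
    --data-urlencode "input.query(qemb)=embed($2)" \
    --data-urlencode "streaming.groupname=$1"
}

# Example (requires the application to be deployed and fed):
#   vespa_query_curl 1234 'events to attend this summer'
```

Using `--get` with `--data-urlencode` lets curl handle the URL encoding of the YQL string and the query text.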

Exact nearest neighbor search with timestamp filter

$ vespa query 'yql=select * from sources * where {targetHits:10}nearestNeighbor(embedding,qemb) and timestamp >= 1685577600' \
  'streaming.groupname=1234' \
  'input.query(qemb)=embed(events to attend this summer)'

This query only returns documents with a timestamp at or after 2023-06-01 00:00:00 UTC (Unix time 1685577600).
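The timestamp value in the filter is Unix time in seconds. As a small sketch of how such a cutoff can be derived from a calendar date, the helper below tries the GNU `date` syntax first and falls back to the BSD/macOS syntax. The function name `to_epoch` is illustrative, and the fallback behavior is an assumption about the `date` implementations involved.

```shell
# to_epoch YYYY-MM-DD: print Unix epoch seconds for midnight UTC on that date.
# Tries GNU date first, then the BSD/macOS form.
to_epoch() {
  date -u -d "$1 00:00:00" +%s 2>/dev/null ||
    date -u -j -f "%Y-%m-%d %H:%M:%S" "$1 00:00:00" +%s
}

cutoff=$(to_epoch 2023-06-01)
echo "$cutoff"   # 1685577600

# The value can then be substituted into the filter, for example:
#   vespa query "yql=select * from sources * where {targetHits:10}nearestNeighbor(embedding,qemb) and timestamp >= $cutoff" \
#     'streaming.groupname=1234' \
#     'input.query(qemb)=embed(events to attend this summer)'
```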

Exact nearest neighbor search with content filter

$ vespa query 'yql=select * from sources * where {targetHits:10}nearestNeighbor(embedding,qemb) and content contains "sofa"' \
  'streaming.groupname=5678' \
  'input.query(qemb)=embed(list all order confirmations)'

This query only returns documents that match "sofa" in the content field.

Cleanup

Tear down the running container:

$ docker rm -f vespa