Before we can deploy the search service, we need to work out which integrations it should rely on.
Right now we are using a third-party vector database to store embeddings (Zilliz) and a third-party embeddings service to generate embeddings for queries (OpenAI).
Our requirements are:
We need to encode our docs site as a set of embeddings in the database.
We need to take user queries in natural language and convert them into embeddings.
We need to search our database for matching vectors.
Self-Hosted Database
I am convinced that we should be able to host our own vector database in the container.
The database should be built offline as part of a builder image. We can use whatever dev dependencies are needed in the builder image and drop them for the final production image.
Once the build is complete, we don't need to write to the database again: we only need read and query capability.
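As a rough sketch of that build step, assuming we stay in the Milvus ecosystem: pymilvus ships Milvus Lite, a file-backed mode that would let the builder stage write a single database file for the production image to ship read-only. The load_doc_chunks() and embed() helpers here are placeholders for our pipeline:

```python
# build_index.py -- runs in the builder image only (sketch).
# Writes a self-contained Milvus Lite database file; the production image
# ships the file read-only, with pymilvus as its only database dependency.
from pymilvus import MilvusClient

from our_pipeline import load_doc_chunks, embed  # placeholder helpers

client = MilvusClient("docs.db")  # a local file path selects Milvus Lite
client.create_collection(collection_name="docs", dimension=384)  # 384 assumes a small encoder

rows = [
    {"id": i, "vector": embed(chunk.text), "text": chunk.text, "url": chunk.url}
    for i, chunk in enumerate(load_doc_chunks())
]
client.insert(collection_name="docs", data=rows)
```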
We could even trigger a new Apollo build every time the doc site is updated to keep things in sync. But the doc site doesn't update THAT often, so we don't really need a live sync. A weekly rebuild would be fine.
I suppose we could use the database to cache searches later (though even then, the database might not be the best way to do this).
We should choose an open source database from the options available.
I don't actually know how big the embeddings for the doc site are in memory, but I doubt it's gigabytes. Back of the envelope: even 50,000 chunks at 1,536 float32 dimensions is about 300 MB, and a 384-dimension model would put the same corpus under 100 MB.
Note that this means the Apollo server needs to actually run queries against the DB. Up until now Apollo has really just been a proxy server; from here it'll start doing its own actual work.
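The runtime query path would stay small; a minimal sketch, again assuming Milvus Lite and a placeholder embed() for the query encoder:

```python
# search.py -- query path inside the Apollo server (sketch).
from pymilvus import MilvusClient

from our_pipeline import embed  # placeholder query encoder

client = MilvusClient("docs.db")  # the read-only file produced at build time

def search_docs(query: str, limit: int = 5) -> list[dict]:
    """Embed the user's query and return the closest doc chunks."""
    hits = client.search(
        collection_name="docs",
        data=[embed(query)],
        limit=limit,
        output_fields=["text", "url"],
    )
    # pymilvus returns one result list per query vector; we sent exactly one.
    return [hit["entity"] for hit in hits[0]]
```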
Third-Party Database
If we really can't bundle up our own database in the container, we'll need to use a third party.
We're currently using Zilliz, which is the SaaS version of Milvus.
We should choose a partner whose product is open source, isn't too expensive, and ideally aligns with our values.
Self-Hosted Embeddings Model
In a perfect world we would keep the embeddings model in the image too. This would mean that the Apollo server needs to be big enough and powerful enough to run an LLM.
Note that the dev dependencies for the model don't need to be in the final production image; we shouldn't need to ship torch and all its built-in models.
I suspect that we can build an embeddings model in a builder image, then remove all the dev dependencies, and end up with a final model that's around 1 GB in size.
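One way to get there, sketched under the assumption that we use a Hugging Face sentence-transformers model: export it to ONNX in the builder stage with optimum, so the production image only needs onnxruntime rather than torch. The model choice here is hypothetical:

```python
# export_model.py -- builder image only (sketch).
# Exports a sentence-transformers model to ONNX so the production image
# can run it with onnxruntime instead of shipping torch.
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"  # hypothetical choice

model = ORTModelForFeatureExtraction.from_pretrained(MODEL_ID, export=True)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

model.save_pretrained("model_onnx/")     # writes model.onnx
tokenizer.save_pretrained("model_onnx/")
```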
We would need to be careful about which model we pick to ensure that a) it is ethically trained and b) it generates good-quality embeddings. We can compare against OpenAI's embeddings and the existing Milvus search to get a sense of how good they are, e.g. with the overlap check sketched below.
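A cheap sanity check: run a handful of known queries through both pipelines and measure how much their top-k results overlap. The search_candidate() and search_current() helpers are hypothetical, standing in for the new stack and the OpenAI-plus-Zilliz stack respectively, each returning a ranked list of doc URLs:

```python
# eval_overlap.py -- rough quality comparison (sketch; helpers hypothetical).
from our_pipeline import search_candidate, search_current  # hypothetical

def overlap_at_k(queries: list[str], k: int = 5) -> float:
    """Average top-k result overlap between the two pipelines."""
    total = 0.0
    for q in queries:
        candidate = set(search_candidate(q)[:k])  # new model + local DB
        current = set(search_current(q)[:k])      # OpenAI + Zilliz today
        total += len(candidate & current) / k
    return total / len(queries)
```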
The model would be called (see the encoder sketch after this list):
At build time, to generate embeddings for the doc site
At runtime, to generate embeddings for a user query
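Under the ONNX approach above, one encoder would serve both call sites. A minimal runtime sketch, assuming the model_onnx/ directory exported in the builder stage (we embed one text at a time, so plain mean pooling without padding masks is fine here):

```python
# embed.py -- shared encoder for build-time and query-time use (sketch).
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model_onnx/model.onnx")
tokenizer = AutoTokenizer.from_pretrained("model_onnx/")

def embed(text: str) -> list[float]:
    """Encode one text into a normalised embedding vector."""
    inputs = tokenizer(text, return_tensors="np", truncation=True)
    last_hidden = session.run(None, dict(inputs))[0]  # shape (1, seq, dim)
    vec = last_hidden.mean(axis=1)[0]  # mean-pool over tokens
    return (vec / np.linalg.norm(vec)).tolist()  # L2-normalise
```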
Third-Party Embeddings Model
If we can't self-host the model, we'll have to stick with a third party. That may well be appropriate, but we'd need a cost-effective solution.
We currently use the OpenAI embeddings service. Anthropic recommends https://www.voyageai.com/, which I'd at least like to take a serious look at.
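For reference, the current query-time call is roughly this (a sketch using the official openai Python client; the model name is an assumption and may not match what we actually run):

```python
# embed_remote.py -- current third-party query path (sketch).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",  # assumption: our actual model may differ
        input=text,
    )
    return response.data[0].embedding
```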