This repository showcases a Retrieval-Augmented Generation (RAG) system for interacting with documentation: it uses natural language queries to retrieve and summarize relevant information.
*(Demo video: `interactive-demo.webm`)*
- Creates a Qdrant vector database for embeddings from the given CSV file(s)
- The vector database is used for fast similarity search to find relevant documentation
- We use a CSV based on Hugging Face documentation as an example
- Uses OpenAI's embeddings for similarity search and GPT models for high-quality responses
- Provides an interactive interface for querying the documentation using natural language
- Each query retrieves the most relevant documentation snippets for context (see the sketch after this list)
- Answers include source links for reference
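As a minimal sketch, the flow above looks roughly like the following. The collection name (`docs`) and the payload fields (`content`, `source`) are assumptions for illustration, not necessarily what this repository's code uses:

```python
from openai import OpenAI
from qdrant_client import QdrantClient

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
qdrant = QdrantClient(path="qdrant_db")  # local on-disk vector database

def answer(question: str) -> str:
    # 1) Embed the question with the same model used to index the docs.
    vector = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=question,
    ).data[0].embedding

    # 2) Similarity search for the most relevant documentation snippets.
    hits = qdrant.search(collection_name="docs", query_vector=vector, limit=5)
    context = "\n\n".join(
        f"{hit.payload['content']}\nSource: {hit.payload['source']}" for hit in hits
    )

    # 3) Ask the chat model to answer from the retrieved context only.
    chat = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided documentation and cite the source links."},
            {"role": "user", "content": f"Documentation:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat.choices[0].message.content
```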
- Valohai account to run the pipelines
- OpenAI account to use their APIs
- Less than $5 in OpenAI credits
If you can't find this project in your Valohai Templates, you can set it up manually:
- Create a new project on Valohai
- Set the project repository to: https://github.com/valohai/rag-doc-example
- Save the settings and click "Fetch Repository"
- 🔑 Create an OpenAI API key for this project
  - We will need the API key in the next step, so make a note of it
- Assign the API key to this project:
  You will see "✅ Changes to OPENAI_API_KEY saved" if everything went correctly.
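Setting the environment variable is all the configuration the code needs; the OpenAI Python client reads `OPENAI_API_KEY` from the environment by default:

```python
from openai import OpenAI

# The client picks up the OPENAI_API_KEY environment variable automatically;
# OpenAI(api_key=...) would override it explicitly.
client = OpenAI()
```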
And now you are ready to run the pipelines!
- Navigate to the "Pipelines" tab
- Click the "Create Pipeline" button
- Select the "assistant-pipeline" pipeline template
- Click the "Create pipeline from template" button
- Feel free to look around and finally click the "Create pipeline" button
This will start the pipeline. Feel free to explore while it runs.
When it finishes, the last step will contain qualitative results to review.
This manual evaluation is a simplified way to validate the quality of the generated
responses. "LLM evals" are a large topic outside the scope of this particular example.
Now you have a mini-pipeline that maintains a RAG vector database and allows you to ask questions about the documentation. You can ask your own questions by creating new executions based on the "do-query" step.
The repository also contains a pipeline, "assistant-pipeline-with-deployment", which deploys the RAG system to an HTTP endpoint after human validation at the "manual-evaluation" pipeline step.
🤩 Show Me!
- Create a Valohai Deployment to specify where the HTTP endpoint should be hosted. You can use Valohai Public Cloud and valohai.cloud as the target when trying it out. Make sure to name the deployment `public`.
- Create a pipeline as we did before, but use the "assistant-pipeline-with-deployment" template.
- The pipeline will halt in a "⏳️ Pending Approval" state, where you can click the "Approve" button to proceed.
- After approval, the pipeline will build and deploy the endpoint.
- You can use the "Test Deployment" button to run test queries against the endpoint.
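You can also exercise the endpoint outside the Valohai UI with a plain HTTP request. The URL path and JSON payload below are hypothetical; check your deployment's endpoint definition for the real ones:

```python
import requests

# Hypothetical endpoint URL and payload; adjust to match your deployment.
url = "https://valohai.cloud/<owner>/<project>/public/<endpoint>"
response = requests.post(url, json={"question": "How do I load a dataset?"})
print(response.status_code, response.json())
```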
This example uses OpenAI for both the embedding and query models.
Either could be changed to a different provider or a local model.
🤩 Show Me!
Changing models inside the OpenAI ecosystem is a matter of changing constants in `src/rag_doctor/consts.py`:

```python
EMBEDDING_MODEL = "text-embedding-ada-002"
EMBEDDING_LENGTH = 1_536  # the dimensions of a "text-embedding-ada-002" embedding vector

PROMPT_MODEL = "gpt-4o-mini"
PROMPT_MAX_TOKENS = 128_000  # model "context window" from https://platform.openai.com/docs/models
```
Modifying the chat model beyond that involves reimplementing the query logic in `src/rag_doctor/query.py`.
Similarly, modifying the embedding model is a matter of reimplementing the embedding logic in both `src/rag_doctor/database.py` and `src/rag_doctor/query.py`.
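For example, a local embedding model could be swapped in roughly like this (using sentence-transformers purely as an illustration; this repository does not use it):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dimensional vectors

def embed(texts: list[str]) -> list[list[float]]:
    # Both indexing (database.py) and querying (query.py) must embed with this
    # same function, and EMBEDDING_LENGTH must match the new vector size (384).
    return model.encode(texts).tolist()
```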
If you decide to change the embedding model, remember to recreate the vector database.
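Recreating the database for a new vector size might look roughly like this (the collection name and storage path are assumptions, not the repository's actual values):

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

qdrant = QdrantClient(path="qdrant_db")  # assumed local database path
qdrant.recreate_collection(
    collection_name="docs",  # assumed collection name
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),  # new dimension
)
```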
Take a look at the input file given to the "embedding" node, create a similar CSV from your own documentation, and replace the input with that CSV.
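The column names below are hypothetical; match the columns of the example input file rather than these guesses:

```python
import csv

# Hypothetical columns; inspect the example CSV for the real header.
rows = [
    {"source": "https://example.com/docs/intro", "content": "Intro page text..."},
]
with open("my_docs.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["source", "content"])
    writer.writeheader()
    writer.writerows(rows)
```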
You can also run the individual pieces locally by following the instructions in the DEVELOPMENT file.