The deloitte-insightbot is a question-and-answer system that provides insights based on Deloitte's weekly economic updates. These updates give a brief overview of the global political and economic situation, summarizing key impacts and trends.
- Data Ingestion: Fetches content from Deloitte's weekly economic update URL.
- Embeddings Storage: Stores embeddings of the content in a VectorDB.
- Retrieval-Augmented Generation: Retrieves relevant passages to generate answers for user queries using an LLM.
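The retrieval step can be pictured as a nearest-neighbour lookup over stored vectors. The sketch below is a toy illustration only: the vectors are made-up three-dimensional values rather than real embeddings, `cosine_similarity` and `top_k` are hypothetical helpers, and in the project itself this search is delegated to the VectorDB.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k passages whose vectors are most similar to query_vec.

    `store` is a list of (passage_text, vector) pairs held in memory.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up passages and vectors, for illustration only.
store = [
    ("inflation outlook", [1.0, 0.0, 0.0]),
    ("trade policy", [0.0, 1.0, 0.0]),
    ("consumer prices", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.0, 0.0], store, k=2))
```

The passages returned by a lookup like this are what the LLM receives as context when it generates an answer.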
- Data Ingestion: A module to scrape and parse content from the specified URL.
  - Uses the `UnstructuredURLLoader` class to fetch and parse the content from the URL.
- Embeddings Model: Uses an embedding model to convert content into vector representations.
  - Uses the `OpenAIEmbeddings` class with the model name `text-embedding-3-large`.
- VectorDB: Stores the embeddings for efficient retrieval.
  - Uses the `Chroma` class from `langchain_chroma` to interact with ChromaDB.
- LLM: Generates answers based on the retrieved passages.
  - Uses the `ChatOpenAI` class with the model name `gpt-3.5-turbo`.
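A minimal sketch of how the four components might be wired together, under several assumptions: `Settings`, `build_components`, the collection name, and the placeholder URL are hypothetical; the import path for `UnstructuredURLLoader` is assumed to be `langchain_community.document_loaders`; and the OpenAI classes are assumed to read `OPENAI_API_KEY` from the environment. The project's actual code in `src/main.py` may be organised differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; the model names come from the component list."""
    source_url: str = "https://example.com/weekly-update"  # placeholder, not the real URL
    embedding_model: str = "text-embedding-3-large"
    chat_model: str = "gpt-3.5-turbo"
    chroma_host: str = "localhost"
    chroma_port: int = 8000
    collection_name: str = "weekly_update"  # assumed name


def build_components(settings: Settings):
    """Construct the loader, embeddings model, vector store, and LLM."""
    # Third-party imports stay local so Settings can be inspected
    # without the LangChain packages installed.
    import chromadb
    from langchain_chroma import Chroma
    from langchain_community.document_loaders import UnstructuredURLLoader
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    loader = UnstructuredURLLoader(urls=[settings.source_url])
    embeddings = OpenAIEmbeddings(model=settings.embedding_model)
    vector_store = Chroma(
        collection_name=settings.collection_name,
        embedding_function=embeddings,
        # Connects to the ChromaDB server rather than an in-process store.
        client=chromadb.HttpClient(host=settings.chroma_host, port=settings.chroma_port),
    )
    llm = ChatOpenAI(model=settings.chat_model)
    return loader, embeddings, vector_store, llm
```

Calling `build_components(Settings())` needs a running ChromaDB server and valid OpenAI credentials, so it is left out of the snippet itself.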
- Ingest Data: Run the data ingestion script to fetch and parse the content.
- Store Embeddings: Use the embeddings model to convert the content into vectors and store them in the VectorDB.
- Query System: Input a user query to retrieve relevant passages and generate an answer using the LLM.
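The query step can be sketched independently of any particular library by treating retrieval and generation as plain callables. Everything named here (`PROMPT_TEMPLATE`, `build_prompt`, `answer`) is hypothetical, and the prompt wording is only one plausible choice, not the project's actual prompt.

```python
from typing import Callable, List

# Assumed prompt wording; the real application may phrase this differently.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, passages: List[str]) -> str:
    """Number the retrieved passages and place them ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, question=question.strip())


def answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve passages for the question, then ask the LLM to answer from them."""
    passages = retrieve(question)
    return generate(build_prompt(question, passages))
```

With LangChain, `retrieve` could wrap a vector store retriever and `generate` could wrap a chat model call; passing them in as functions also makes the flow easy to exercise with stubs.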
- Install the required packages: `pip install -r requirements.txt`
- Start the ChromaDB container: `docker compose up -d`
- Ping the ChromaDB container to check that it is running: `curl localhost:8000/api/v1/heartbeat`
- Run the application: `python src/main.py`
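Because the container can take a moment to come up, the heartbeat check can also be scripted before launching the application. This is a sketch using only the standard library; `http_status` and `wait_for_heartbeat` are hypothetical helpers, and the retry count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request

HEARTBEAT_URL = "http://localhost:8000/api/v1/heartbeat"


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None


def wait_for_heartbeat(url=HEARTBEAT_URL, attempts=10, delay=1.0, fetch=http_status):
    """Poll the heartbeat endpoint until it returns 200 or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A startup script could call `wait_for_heartbeat()` and exit with an error when it returns `False`, instead of letting the application fail on its first database call.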