
What's the difference between context and contexts in the response output? #755

Open
n400peanuts opened this issue Dec 9, 2024 · 4 comments
Labels
question Further information is requested

Comments

@n400peanuts

The output of docs.query() gives me a Python object that contains a formatted_answer, which is the answer generated by the LLM given the input docs. I wish to know which context the answer was generated from. I can see that the output contains both context and contexts. What do these variables contain, and how do I get the verbatim context the LLM used to generate the formatted_answer from docs.query()?

@dosubot dosubot bot added the question Further information is requested label Dec 9, 2024

dosubot bot commented Dec 9, 2024

The context and contexts variables in the response output of docs.query() are related but serve different purposes:

  1. contexts: This variable contains a list of potential contexts retrieved from the documents. These contexts are scored and sorted based on their relevance to the query. The contexts are initially populated by retrieving relevant texts using the retrieve_texts method and are then filtered and sorted by score and name [1].

  2. context: This variable is a formatted string created from the selected contexts in contexts. It is constructed by formatting the top-scoring contexts into a single string using a specified prompt configuration. This context string is then used by the language model to generate the formatted_answer [1].

To get the verbatim context used by the LLM to generate the formatted_answer, you should look at the context variable, as it represents the final formatted string of selected contexts that were used in the prompt to generate the answer.
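The distinction above can be sketched with plain dataclasses. This is a hypothetical model following the attribute names mentioned in this thread, not paper-qa's actual source:

```python
from dataclasses import dataclass

@dataclass
class Text:
    text: str      # verbatim excerpt from the source document
    name: str      # chunk identifier, e.g. "Smith2020 pages 3-4"
    citation: str  # full citation for the source document

@dataclass
class Context:
    context: str   # LLM-generated summary of the excerpt
    text: Text     # the original excerpt the summary came from
    score: int     # relevance score used for filtering/sorting

def build_prompt_context(contexts: list[Context], top_k: int = 2) -> str:
    """Mimic how `context` is assembled: keep the top-scoring
    summaries and join them into one string for the answer prompt."""
    ranked = sorted(contexts, key=lambda c: c.score, reverse=True)
    return "\n\n".join(c.context for c in ranked[:top_k])
```

In this model, `contexts` corresponds to the full ranked list of `Context` objects, while `context` corresponds to the single string `build_prompt_context` returns.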


@n400peanuts
Author

But @dosu: how can I retrieve the literal text from the papers that has been used to generate the context?


dosubot bot commented Dec 9, 2024

To retrieve the literal text from the papers that has been used to generate the context, you can trace back from the contexts variable. Each Context object in the contexts list contains a text attribute, which holds the original text from the document. This text attribute is part of a Text object that includes the document's name and citation, allowing you to identify and access the original text used to form the context [1].
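As a hedged illustration — assuming the attribute names described above (`response.contexts`, `Context.text`, and `Text.name`/`Text.text`) match your paper-qa version — the verbatim excerpts can be collected like this:

```python
from types import SimpleNamespace

def verbatim_excerpts(response) -> dict[str, str]:
    """Map each chunk name to the literal document text behind it,
    assuming `response.contexts` holds Context objects as described."""
    return {c.text.name: c.text.text for c in response.contexts}

# Stand-in response object for illustration only (not a real query result):
fake_response = SimpleNamespace(contexts=[
    SimpleNamespace(text=SimpleNamespace(
        name="Smith2020 pages 3-4",
        text="We observed a 12% increase in yield under condition X.",
    )),
])
```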


@maykcaldas
Collaborator

Hey @n400peanuts.

Yes, you're right. docs.query returns the formatted_answer and the context used to generate that answer. However, LLMs are used to summarize the actual excerpts extracted from the scientific articles.

The actual call stack is something similar to:

  docs.query()          --> returns the answer to the query
  docs.get_evidence()   --> returns the summarized evidence obtained from the literature
  docs.retrieve_texts() --> returns the actual text extracted from the literature

Therefore, you can use docs.retrieve_texts() to access the literal text extracted. But this is not formatted and may be difficult to read.
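A toy model of that layering, purely illustrative (this is not paper-qa's real implementation, and the real retrieve_texts signature may differ), could look like:

```python
class ToyDocs:
    """Toy stand-in for Docs, showing how each layer builds on the one below."""

    def __init__(self, chunks):
        self._chunks = chunks  # raw text chunks from the literature

    def retrieve_texts(self, query):
        # Bottom layer: return the literal, unformatted chunks.
        # (Naive substring match stands in for real vector retrieval.)
        return [c for c in self._chunks if query.lower() in c.lower()]

    def get_evidence(self, query):
        # Middle layer: "summarize" each retrieved chunk (trivially here,
        # by truncation; the real library uses an LLM for this step).
        return [c[:40] + "..." for c in self.retrieve_texts(query)]

    def query(self, query):
        # Top layer: combine the evidence into a formatted answer.
        return "Based on the evidence: " + " | ".join(self.get_evidence(query))
```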
