Feature: SQL Template Matcher #5
Reiterating: Let's break the PR (draft) into 2:
Possible use case:
For the LLM-based matcher, we can provide an in-context list of templates (query + SQL; all-shot prompting?) and let the LLM return scored output that we can use.
Usage
Possible usage could be:
from larch import SQLTemplate
from larch.search import FuzzySQLTemplateMatcher
templates = [
    SQLTemplate(
        query="What is the capital of Nepal?",
        sql="select * from country c where c.country = 'Nepal'",
    ),
]
matcher = FuzzySQLTemplateMatcher(templates=templates, ...)
query = "Capital of Nepal"
matched_templates = matcher(query=query, ...)
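To make the proposed usage concrete, here is a minimal sketch of what a fuzzy matcher could look like. It is not the implementation in this PR: the `SQLTemplate` fields follow the usage snippet above, the `similarity_threshold` default is made up, and scoring with `difflib.SequenceMatcher` from the standard library is an assumption chosen only to keep the example dependency-free.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass(frozen=True)
class SQLTemplate:
    # Field names follow the usage snippet above; the real class may differ.
    query: str
    sql: str


class FuzzySQLTemplateMatcher:
    """Hypothetical stand-in that ranks templates by string similarity."""

    def __init__(self, templates: List[SQLTemplate], similarity_threshold: float = 0.5):
        self.templates = templates
        self.similarity_threshold = similarity_threshold  # assumed default

    def _score(self, query: str, template: SQLTemplate) -> float:
        # Case-insensitive similarity ratio in the range [0, 1].
        return SequenceMatcher(None, query.lower(), template.query.lower()).ratio()

    def match(self, query: str, top_k: int = 1) -> List[SQLTemplate]:
        scored = [(self._score(query, t), t) for t in self.templates]
        # Drop templates below the threshold, then keep the best top_k.
        kept = [(s, t) for s, t in scored if s >= self.similarity_threshold]
        kept.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in kept[:top_k]]

    # Allow matcher(query=...) as in the usage snippet.
    __call__ = match
```

With this sketch, `matcher(query="Capital of Nepal")` returns the matching `SQLTemplate` objects themselves rather than bare SQL strings.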
        self.debug=debug

    @abstractmethod
    def match(self, query: str, top_k=1, **kwargs) -> List[str]:
Let's return List[SQLTemplate] instead of List[str].
While the fuzzy matcher can return List[SQLTemplate], doing so may not be as efficient with the LLM-based matcher. If the number of templates is large, getting the list of SQL templates with the pattern- and entity-substituted SQL query might require prompting the LLM for a list of matching queries. I haven't experimented with the LLM part yet, so I can't fully back this up.
I'll add more context once I know how it performs.
I mean the returned list should still be a subset of all the templates, cut off by threshold or top_k. So the return type should show which SQLTemplate objects are returned. Hence List[SQLTemplate] makes more sense, as it also gives us an idea of what sort of queries and intents are being matched for the input query.
Re: With the LLM-based matcher, even if the LLM just gives us the SQL query, we can ideally reverse-map it to the original SQLTemplate object as well. I think the result the LLM returns could in fact be enforced by in-context prompting with the SQL templates. Nevertheless, let's just stick with List[SQLTemplate] as the return type, because we're technically just selecting/matching the input templates.
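The reverse-mapping idea from this thread could be sketched roughly as below. Everything here is an assumption for illustration: the `reverse_map` helper and its `_normalize` step are hypothetical, and the `SQLTemplate` fields follow the usage snippet in the PR description rather than the final class.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SQLTemplate:
    # Field names follow the PR description; the real class may differ.
    query: str
    sql: str


def _normalize(sql: str) -> str:
    # Collapse whitespace, drop a trailing semicolon, and lowercase, so that
    # trivial formatting differences do not break the lookup. Note that
    # lowercasing also folds string literals, which is a simplification.
    return " ".join(sql.strip().rstrip(";").split()).lower()


def reverse_map(llm_sql: List[str], templates: List[SQLTemplate]) -> List[SQLTemplate]:
    """Map SQL strings returned by an LLM back to the input templates."""
    index: Dict[str, SQLTemplate] = {_normalize(t.sql): t for t in templates}
    matched: List[SQLTemplate] = []
    for sql in llm_sql:
        template = index.get(_normalize(sql))
        # Skip SQL that is not one of the input templates, and skip duplicates.
        if template is not None and template not in matched:
            matched.append(template)
    return matched
```

Because anything the LLM returns that is not one of the input templates is dropped, the result stays a subset of the templates, which fits the List[SQLTemplate] return type agreed on here.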
    """
    query_pattern: str
    sql_template: str
    description: Optional[str] = None
Also, let's add an intent: Optional[str] = None field, in case we want intent detection somewhere in the future.
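For reference, the template with the suggested field might look like the sketch below. The three existing fields are taken from the diff under review; using a plain `dataclass` is an assumption, since the snippet does not show which base the real class uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SQLTemplate:
    # Existing fields, as shown in the diff under review.
    query_pattern: str
    sql_template: str
    description: Optional[str] = None
    # Field suggested in this review comment; optional so that existing
    # templates keep working without an intent.
    intent: Optional[str] = None
```

Since the new field defaults to None, call sites that only pass `query_pattern` and `sql_template` would not need to change.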
Yeah, the plan is to do the same as explained in the description above.
larch/search/template_matcher.py
Outdated
        similarity_threshold: The similarity threshold to be used for fuzzy matching.
    """
    def __init__(self, templates: List[SQLTemplate],
                 llm: BaseLanguageModel,
Make this Optional[BaseLanguageModel], and inside the constructor we can do llm = llm or ChatOpenAI(...).
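The optional-argument pattern suggested here could look roughly like the following. To keep the sketch runnable without external packages, `LanguageModel` and `DefaultLLM` are hypothetical stand-ins for `BaseLanguageModel` and `ChatOpenAI(...)`, and the class name `LLMBasedSQLTemplateMatcher` is also an assumption.

```python
from typing import List, Optional, Protocol


class LanguageModel(Protocol):
    """Stand-in for BaseLanguageModel (assumption, not the real interface)."""

    def predict(self, text: str) -> str: ...


class DefaultLLM:
    """Stand-in for ChatOpenAI(...); returns an empty completion."""

    def predict(self, text: str) -> str:
        return ""


class LLMBasedSQLTemplateMatcher:
    def __init__(self, templates: List[str], llm: Optional[LanguageModel] = None):
        self.templates = templates
        # Fall back to a default model only when the caller passes nothing.
        # An explicit None check is slightly stricter than `llm or ...`,
        # which would also replace any falsy-but-valid object.
        self.llm = llm if llm is not None else DefaultLLM()
```

Callers that do not care which model is used can then omit `llm`, while tests can still inject a fake.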
@anisbhsl can you also run the code through pre-commit hooks please.
Once done, you'll be able to run the hooks automatically every time you invoke
Another thought I have: By making use of
Closing this for the time being, as apparently it's already in develop for some reason.
This is a feature PR for SQL based template matcher. Two downstream PRs:
will be branched off this PR.