Feature: SQL Template Matcher #5
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
from abc import ABC, abstractmethod | ||
|
||
from typing import Any, List, Optional | ||
from langchain.base_language import BaseLanguageModel | ||
from langchain.chat_models import ChatOpenAI | ||
|
||
from ..schema import SQLTemplate | ||
|
||
class SQLTemplateMatcher(ABC): | ||
""" | ||
SQLTemplateMatcher is a base class for all SQL based template matchers. | ||
""" | ||
def __init__(self, | ||
templates: List[SQLTemplate], | ||
similarity_threshold: float = 0.4, | ||
debug: bool = False) -> None: | ||
self.templates = templates | ||
self.similarity_threshold = similarity_threshold | ||
self.debug = debug | ||
|
||
@abstractmethod | ||
def match(self, query: str, top_k=1, **kwargs) -> List[str]: | ||
Let's return
While the fuzzy matcher might be able to provide response as I'll put more context once I get to know how it performs.
I mean the return list should still be a subset of all the templates, cut off by threshold or top_k. So, the correct return type should which
Re: With llm-based, even if the LLM just gives us the SQL query, we can ideally reverse-map the original SQLTemplate object as well. I think the result that the LLM returns could in fact be enforced by in-context prompting with SQL templates. Nevertheless, let's just stick with |
||
""" | ||
Match the given query against the templates. | ||
|
||
Args: | ||
query: The query to match against the templates. | ||
top_k: The number of top-k templates to return. Defaults to 1. | ||
Returns: | ||
A list of top-k templates that match the query with entity substitution. | ||
""" | ||
raise NotImplementedError() | ||
|
||
|
||
def __call__(self, *args: Any, **kwds: Any) -> List[str]: | ||
return self.match(*args, **kwds) | ||
|
||
|
||
class FuzzySQLTemplateMatcher(SQLTemplateMatcher): | ||
""" | ||
FuzzySQLTemplateMatcher is a SQL based template matcher that uses fuzzy matching. | ||
Given a query, it will use rule-based matching to find the best-matching template | ||
and return the template(s) with entity substitution. | ||
|
||
Args: | ||
templates: A list of SQL templates. | ||
similarity_threshold: The similarity threshold to be used for fuzzy matching. | ||
""" | ||
def __init__(self, templates: List[SQLTemplate], | ||
similarity_threshold: float = 0.4, | ||
debug: bool = False) -> None: | ||
super().__init__(templates=templates, | ||
similarity_threshold=similarity_threshold, | ||
debug=debug) | ||
|
||
def match(self, query: str, top_k=1, **kwargs) -> List[str]: | ||
pass | ||
|
||
|
||
class LLMBasedSQLTemplateMatcher(SQLTemplateMatcher): | ||
""" | ||
LLMBasedSQLTemplateMatcher uses LLM to find the best matching template. | ||
Given a query, it will extract the key entities and use the LLM to find the best SQL template | ||
and generate a substituted SQL query. | ||
|
||
Args: | ||
templates: A list of SQL templates. | ||
ddl_schema: The DDL schema for available tables. | ||
similarity_threshold: The similarity threshold to be used for fuzzy matching. | ||
""" | ||
def __init__(self, templates: List[SQLTemplate], | ||
llm: BaseLanguageModel, | ||
Make this |
||
ddl_schema: Optional[str] = None, | ||
similarity_threshold: float = 0.4, | ||
debug: bool = False) -> None: | ||
super().__init__( | ||
templates=templates, | ||
similarity_threshold=similarity_threshold, | ||
debug=debug) | ||
|
||
self.llm = llm or ChatOpenAI(temperature=0.0, model="gpt-3.5-turbo") | ||
self.ddl_schema = ddl_schema | ||
|
||
def match(self, query: str, top_k = 1, **kwargs) -> List[str]: | ||
pass |
Also, let's add an `intent: Optional[str] = None` field as well, in case we want intent detection somewhere in the future.
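
To make the review threads concrete, here is a minimal sketch of a matcher that returns a subset of the templates, cut off by the similarity threshold and by top_k, together with the optional intent field suggested above. It is an illustration, not the PR's implementation: the `SQLTemplate` dataclass below is a hypothetical stand-in (the real class in `..schema` is not shown in this diff, so the `question` and `sql` fields are assumptions), `match_templates` is a made-up helper name, and the standard library's `difflib.SequenceMatcher` is swapped in for whatever fuzzy scoring the PR ends up using.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class SQLTemplate:
    # Hypothetical stand-in for ..schema.SQLTemplate; field names are assumptions.
    question: str
    sql: str
    intent: Optional[str] = None  # optional field proposed in review


def match_templates(
    query: str,
    templates: List[SQLTemplate],
    similarity_threshold: float = 0.4,
    top_k: int = 1,
) -> List[SQLTemplate]:
    """Return at most top_k templates whose similarity to the query meets the threshold."""
    # Score every template by character-level similarity of its question to the query.
    scored = [
        (SequenceMatcher(None, query.lower(), t.question.lower()).ratio(), t)
        for t in templates
    ]
    # Drop templates below the threshold, then keep the best top_k by score.
    kept = [(score, t) for score, t in scored if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in kept[:top_k]]
```

Because the function returns the template objects themselves rather than strings, a caller can still read the SQL text, the intent, or any other field from each match.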