feat(wren-ai-service): prompting to avoid column ambiguity #906

Merged
cyyeh merged 1 commit into main from feat/prompt-avoid-column-ambiguity
Nov 13, 2024

Conversation

paopa
Member

@paopa paopa commented Nov 13, 2024

This PR updates the text-to-SQL rules to enforce table alias qualification for column names, improving query clarity and preventing ambiguity in generated SQL queries.

Changes

Key updates to the SQL generation rules:

  • Added rule: "ALWAYS QUALIFY column names with their table name or table alias"
  • Updated a few examples to follow the new rule

Before:

SELECT SUM(PriceSum) FROM Revenue 
WHERE CAST(PurchaseTimestamp AS TIMESTAMP WITH TIME ZONE) >= ...

After:

SELECT SUM(r.PriceSum) FROM Revenue r 
WHERE CAST(r.PurchaseTimestamp AS TIMESTAMP WITH TIME ZONE) >= ...

Summary

After this prompt optimization, the generated SQL qualifies every column with its table name or alias to avoid column ambiguity. We can then monitor whether the issue occurs again. An example of the generated SQL:

SELECT "olist_customers_dataset"."customer_city" AS "city",
       "olist_orders_dataset"."order_id",
       SUM("olist_order_items_dataset"."price") AS "total_value"
FROM "olist_orders_dataset"
JOIN "olist_order_items_dataset"
  ON "olist_orders_dataset"."order_id" = "olist_order_items_dataset"."order_id"
JOIN "olist_customers_dataset"
  ON "olist_orders_dataset"."customer_id" = "olist_customers_dataset"."customer_id"
GROUP BY "olist_customers_dataset"."customer_city", "olist_orders_dataset"."order_id"
ORDER BY "city", "total_value" DESC
LIMIT 3
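
To illustrate the kind of ambiguity the new rule guards against, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` and `order_items` tables and the `run` helper are hypothetical stand-ins for illustration only; they are not part of wren-ai-service, and the actual engine behind the service may report the error differently.

```python
import sqlite3


def run(conn, sql):
    """Return the result rows, or the error message if SQLite rejects the query."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return str(exc)


# Hypothetical schema: both tables expose an `order_id` column.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE order_items (order_id INTEGER, price REAL);
    INSERT INTO orders VALUES (1, 10), (2, 11);
    INSERT INTO order_items VALUES (1, 5.0), (1, 7.5), (2, 3.0);
    """
)

# Unqualified: `order_id` could refer to either table, so the query is rejected.
unqualified = run(
    conn,
    "SELECT order_id, SUM(price) "
    "FROM orders JOIN order_items ON orders.order_id = order_items.order_id "
    "GROUP BY order_id",
)

# Qualified with table aliases, as the new prompt rule requires.
qualified = run(
    conn,
    "SELECT o.order_id, SUM(i.price) "
    "FROM orders o JOIN order_items i ON o.order_id = i.order_id "
    "GROUP BY o.order_id ORDER BY o.order_id",
)

print(unqualified)
print(qualified)
```

In a single-table query such as the Before example, an unqualified column still resolves, but once a join brings in a second table with the same column name, only the qualified form is safe. That is why the rule asks for qualification everywhere rather than only in joins.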

@paopa paopa added module/ai-service ai-service related ci/ai-service ai-service related labels Nov 13, 2024
@paopa paopa marked this pull request as ready for review November 13, 2024 11:09
@paopa paopa requested a review from cyyeh November 13, 2024 11:09
@cyyeh cyyeh merged commit bac4e90 into main Nov 13, 2024
11 checks passed
@cyyeh cyyeh deleted the feat/prompt-avoid-column-ambiguity branch November 13, 2024 13:08
2 participants