-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing DefaultDimensionSpec in Druid query when querying with multiple Lookup dimensions. #716
Comments
Does the request have forceDimensionDriven set to true? Try to get debug info to see if isDimDriven or forceDimensionDriven are true on request model. |
Also try running the request with forceDimensionDriven: "false" to see what the query looks like. |
Following are different boolean RequestModel variables set in our case.
forceDimensionDriven, isDimDriven set to false in our case. |
Looks like isDimDriven is the opposite of isFactDriven, So it is being set to true in our case. |
You may have identified a corner case which may not have coverage. Try setting forceFactDriven to true. |
Yes, setting the forceFactDriven variable to true helped with generating the desired Druid query. I see the variable can be set in Maha request JSON payload. Should the variable be defaulted to true since most of the queries are fact-driven. Let me know if you have any recommendations. |
Use that as a work around until we figure out what should be the right behavior in the dim driven case. Thanks for reporting it! |
Thanks for the workaround. Are there any repercussions by using the forceFactDriven on all the queries? |
@upendrareddy Unless you have a dim driven use case (e.g. entity management in the UI, e.g. Campaign Management view), they should all be fact driven so I don't see any issues with doing it for all queries. |
Got it. Thank you! |
When we query Maha with multiple lookup dimensions in the select fields we are seeing some of the fields returned as nulls in the results. The underlying druid query issued did not have these lookups listed in the default dimension specs.
To elaborate further
The issued maha query is of the following format.
Maha Query
Output
Druid Query Created by Maha
Upon debugging further I stumbled upon the variable factRequestCols(Set of Strings) at https://github.com/yahoo/maha/blob/master/core/src/main/scala/com/yahoo/maha/core/query/druid/DruidQueryGenerator.scala#L353 which is being passed on to method at https://github.com/yahoo/maha/blob/master/core/src/main/scala/com/yahoo/maha/core/query/druid/DruidQueryGenerator.scala#L381 where druid queries dimension specs are being created in getDimensions method based on factRequestCols passed.
I am not quite sure I understand the logic of factRequestCols set creation here but adgroup_id dimension is not getting included in the resulting set because of which it is not getting added to Druid query dimension spec as well.
I locally overrode the code by passing queryContext.factBestCandidate.requestCols to getDimensions method at https://github.com/yahoo/maha/blob/master/core/src/main/scala/com/yahoo/maha/core/query/druid/DruidQueryGenerator.scala#L381 and it fixed the issue and started populating the dimension spec in druid query as well as had adgroup_id values in the resulting out.
Output after the change
Could you please help me with the context of factRequestCols set creation and also let me know if the logic of the factRequestCols set creation or anything else needs to be changed to include the missing dimensions.
The text was updated successfully, but these errors were encountered: