-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Rule Request] Every dimension table should have exactly one column who is the row identifier #29
Comments
In my opinion, this rule should only apply to dimension tables, or am I missing something? |
if you use a fact table as a bridge between many to many relationships you might end up with strange results. |
Indeed - but in pure star schemas where you don't use many-to-many across a fact table, you will pay a penalty in the VertiPaq compression if you add a column containing only unique values to a table - especially if the table is big (which fact tables typically are). |
You are right. Change the title to Dimension tables should have ... ? |
Agreed. And just for reference, the way we identify a table as being a dimension table, is if the table is on the "one" side of any relationship in the model. |
Isn't then having a row identifier already implicitly enforced by VertiPaq? You can't put a table on the one side of a relationship unless it has a column with unique values, can you? And having exactly one row identifier column is not something that should be enforced. You may have two keys in a dimension, with one being used for data coming from source A, and another key used for data coming from source B -- isn't this a legitimate reason to have two keys? |
Maybe I'm misunderstanding something, but I believe @sqlenders wants a rule that checks all dimension tables (i.e., tables that only appear on the "One" side of any relationship - this is just how the Best Practice Analyzer distinguishes dimension tables from other tables), to make sure the IsKey-property has been set to true for the column participating in relationships. I'm not sure what the advantage (if any) of setting this property is, but I believe it has something to do with the implicit RowNumber column that normally exists on all tables. The TOM does not permit having more than one column in a table with this property set, but nothing prevents you from having multiple columns in a table, that contain only unique values. |
There's probably something I'm misunderstanding as well. Perhaps there's some advantage for this proposed Best Practice Analyzer rule that I'm not seeing -- keen to hear from @sqlenders. |
Sounds stupid until you find out how many tables you have without a rowidentifier/key
The text was updated successfully, but these errors were encountered: