Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

primary keys for dept_manager and dept_emp are too strict #20

Open
jdduncan opened this issue Jan 31, 2020 · 3 comments
Open

primary keys for dept_manager and dept_emp are too strict #20

jdduncan opened this issue Jan 31, 2020 · 3 comments
Labels
changes for 2.0 (compatibility breaking) changes related to 2.0 version enhancement help wanted

Comments

@jdduncan
Copy link

In theory, Joe could work in Marketing from 1995 to 1998, and then in Sales from 1998 until 2000, and then in Marketing again. This may not actually occur in the sample data set. But it cannot be modeled in dept_emp if the primary key is (emp_no,dept_no).

If Joe is the department manager, this problem also applies to the dept_manager table.

A looser constraint is that, on any given date, a department has only one manager, and an employee has only one job. To model this, you would need a "temporal range" as a data type that can be indexed, or at least a SQL CHECK that date ranges do not overlap.

But using (emp_no, from_date) as primary key might be a better way to express the constraint. Maybe it would also allow the primary ordered index to be used in evaluating the views.

@datacharmer
Copy link
Owner

Hi JDD,
Thanks for submitting this issue. I agree that the relationship between employees and departments should be better defined with the same granularity that we have for titles and salaries.
However, I have two reservations against changing the current table the way you suggest:

  1. this is fabricated data, and the relationships reflect the relations that were in the original data. If it were an educational dataset, we should probably make the relationship more oriented to reality, where departments are also split by team and some other grouping. Those groups should be also probably related to departments in addition of people (e.g. team X moved from dept A to dept B).
  2. This database has been in use for 12 years with minimal changes. One popular workflow for it is to download the database, run a set of pre-defined queries, and compare results with known ones. If we changed the relationship between employees and departments in such a drastic way, we would probably break a lot of code.

I have been thinking of cleaning up some of the code that is using old features and making it more readable. The improvement that you suggest would be a first step for a version 2.0 of this database, where we break compatibility but provide a better schema.

@datacharmer datacharmer added enhancement changes for 2.0 (compatibility breaking) changes related to 2.0 version labels Feb 1, 2020
@jdduncan
Copy link
Author

jdduncan commented Feb 1, 2020 via email

@0416yty
Copy link

0416yty commented Mar 20, 2024

wow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changes for 2.0 (compatibility breaking) changes related to 2.0 version enhancement help wanted
Projects
None yet
Development

No branches or pull requests

4 participants
@datacharmer @jdduncan @0416yty and others